Just For Fun: The Story of Linus Torvalds

For the last couple of weeks, my bedside reading has been this half-biography, half-autobiography of Linus Torvalds. I have to say, however, that the book is like two books mixed into one. Chapters alternate between Linus talking about his life and big moments in Linux’s history, and David Diamond describing modern-day Linus with a kind of forced wonder. Truthfully, Diamond comes off as a sycophant who couldn’t care less about Linus’s actual flaws and virtues, and cares more about molding some kind of “image” of Linus as embodying humility and genius simultaneously. Near the end, I started only skimming the chapters not written by Linus. Diamond’s really not a good writer, either. (Sorry, Dave.)

Truthfully, the book kind of pops the lid off Linux and makes you understand it as much less glamorous than, say, Wired magazine described it to the public. Linus really just talks about not having a social life, sitting in his room with curtains covering his window, coding all day. Not exactly the ideal role model, I think. Don’t get me wrong, I love the Linux kernel (as much as one can love imperfect software), and Linus made a great contribution toward keeping the UNIX world and UNIX principles alive; it’s just that I like to think of open source developers as something other than the stereotypical, introverted geek. In fact, much of Linus’s half of the book is devoted to his apprehension about giving public talks about Linux. When I think about the fact that I’ve given three or four such talks to date, and enjoy it more every time, I see how different I am from this kind of stereotypical geek.

It also kind of made me dislike Linus. When I saw Revolution OS (a DVD on the rise of open source), the movie kind of endeared Linus to me for his practical nature, as opposed to Richard Stallman’s religious idealism. I like idealism, but Stallman is really religious about it. And he’s bitter. Linus, on the other hand, has that great Northern European “I’m just gonna go with the flow” attitude.

But this book made me realize that Linus is religious in his own sort of way. Included in the book is Linus’s flame war with Andy Tanenbaum over monolithic versus microkernel designs. Truthfully, I’ve studied operating systems and I’m not even sure which design is best, and Linus makes a decent argument for why microkernels end up being just as complex as, or more complex than, monolithic ones. But what I didn’t like is that in the flamefest, Tanenbaum said that deficiencies in MINIX were due to it being a hobby, and that he had duties as a professor. Linus responded, “Re 2: your job is being a professor and researcher: That’s one hell of a good excuse for some of the brain-damages of minix. I can only hope (and assume) that Amoeba [Tanenbaum’s future OS project] doesn’t suck like minix does.”

This just shows me that Linus really is an asshole sometimes. He states this outright in his book. So now, truthfully, I may like the open source movement, but I think I “at least dislike” two of its biggest players (Torvalds and Stallman).

Finally, I think a clip from Tanenbaum’s website points out a nice principle of OS design:

Also, Linus and I are not “enemies” or anything like that. I met him once and he seemed like a nice friendly, smart guy. My only regret is that he didn’t develop Linux based on the microkernel technology of MINIX. With all the security problems Windows has now, it is increasingly obvious to everyone that tiny microkernels, like that of MINIX, are a better base for operating systems than huge monolithic systems. Linux has been the victim of fewer attacks than Windows because (1) it actually is more secure, but also (2) most attackers think hitting Windows offers a bigger bang for the buck so Windows simply gets attacked more. As I did 20 years ago, I still fervently believe that the only way to make software secure, reliable, and fast is to make it small. Fight Features.

I agree. But does a microkernel design actually reduce the overall size of the operating system, or does it just reduce the size of whatever you consider to be the “microkernel”? That is, just because a file system is implemented as a daemon talking to a driver subsystem through message passing doesn’t necessarily mean the file system, or the driver subsystem, is secure. Insecurity could exist even at the boundaries, no? Not to mention instability.

I think Linus and Tanenbaum would have to agree that this debate isn’t an open-and-shut case. The best kernel is probably one that mixes modularity, a strong kernel/userspace boundary, and some of the fancier features of the microkernel approach, while not sacrificing elegance of design or performance.

Free Coders at NYU

I’m organizing a group of people interested in hacking on open source software in a team environment. Right now I’m calling it Free Coders at NYU, and I’ve already set up a wiki and a mailing list. This could end up being very cool. The next meeting is hopefully this coming Tuesday.

I set up the mailing list with GNU Mailman (link above), which was decently painless under Debian Sarge. The only annoying thing was getting it to work with my virtual e-mail address mappings, which are stored in MySQL, but I figured out a trick for that.

I’ve already spoken, via e-mail, with Ronald S. Bultje, an open source developer who works on GStreamer among other projects. He has tentatively agreed to give a talk for us sometime this year.

She says she’s 18, but you’re a sexual predator and pedophile

In this article you see quite an amazing statement by one of the guys who busts “sexual criminals”…

“These girls are only 13, 14 or 15 saying they are 18. Some of the things they are writing are leaving them open to sexual predators and pedophiles,” said Drass.

Excuse me, but if I have sex with a girl who tells me she’s 18, I may have committed a crime, but I did so unknowingly. Having sex with someone who tells you she is of the age of consent but isn’t doesn’t make you a sexual predator or a pedophile. Sexual predators and pedophiles seek out underage people because they are easy victims.

But it’s strange: these cops are so single-mindedly focused on making big busts that they can hardly tell the difference between a pedophile and someone who had sex with an underage girl without ever knowing her real age.

Shouldn’t the parents have to take some responsibility if their children are pretending to be older than they are and thus having illegal sex with people? You can only go so far “shielding” your kids from harm before you just have to sit down with them and tell them the reality of life. But most parents are so thick-headed they don’t even want to talk to their children about sex (hell, most parents have never even let their kids see them naked). Has anyone considered that perhaps the taboo we have on sexuality in this country is what leads kids to such strange misconceptions, so that eventually there are 14-year-olds who think it’s cool to list on their profile, “I like rough sex and I like it 10 times a day”? Why don’t parents just talk to their kids? It’s the only god-damn reason you’re a parent in the first place.

Why do parents think leaving kids unattended on the Internet is any different from letting them walk into the public square unattended, or make random telephone calls unattended? The Internet is a way to contact and communicate with people, often people you do not know. An “innocent child” who cannot make judgments for him/herself simply should not be on there without a full armament of knowledge about the reality of the situation.

Meanwhile, the real victim here isn’t the 14-year-old who thought it would be cool to have sex and so lied about her age to lose her virginity; it’s the guy who sincerely thought she was 18 and ends up going to jail for it.

Corporate Pork in the Age of “Homeland Security”

As reported on most major news stations, Air America, and Slashdot, Lockheed Martin was awarded a big $212 million contract to install thousands of cameras in NYC’s subway system, along with a wireless network which, incidentally, will not work in moving cars. I don’t know whether the cameras themselves will actually work in the cars (it seems to me that if one is a technical hurdle, then the other will be as well), but that remains to be seen.

I know this almost goes without saying, but this is really a waste of taxpayer dollars. People will say this is a good step, that anything goes to make them feel safer, but in the end, we have to think about the facts.

9/11 didn’t happen because of a failure of security or intelligence. It happened because of a failure of imagination. We’ve said this time and time again, but perhaps now we’re forgetting just how surprised we were that terrorists decided to hijack our airplanes and fly them into our buildings while we were worrying about trucks full of explosives being driven into the underground parking garage.

People have worried about subways being a terrorist target for years, even before 9/11. Therefore, it’s quite likely they won’t be a target. It will more likely be an unattended package in Times Square, where it’s crowded and relatively light on security, or a package smuggled into Carnegie Hall, where the well-to-do nature of the crowd makes no one suspect anything, or any number of other possibilities that are completely non-obvious. If protecting against a terrorist is ultimately futile, because smart ones will obviously choose means that you didn’t think of, then why take these measures at all?

Well, one reason is that people in public policy feel pressure to do something, so that when something does happen, they won’t be fired on the grounds of having taken no steps to counter terrorism. Then we hand $200 million over to a corporation that already lives and breathes on our taxpayer dollars for fighter jets and missiles, and we never look back.

In return for this false sense of safety, we get other hidden harms. Invasion of privacy? Check. Feeling like you live in a police state? Check. $200 million we could have spent on health care, education, or retirement benefits? Check.

How about when the new “anti-terrorism” cameras start being used to spot young black kids who might be carrying marijuana, so we can lock them up? Are there legal safeguards in this system so that, when a person is approached on suspicion of being a potential terrorist, a bit of marijuana found on him isn’t admissible as evidence? I doubt it. It’s probably just like the cameras in the parks around New York: installed, supposedly, to prevent rape, but used most often to bust drug deals.

My other concern is much more practical. These cameras won’t work. I heard the woman who sponsored the project for the MTA saying the purpose was to be able to find a suspicious package, identify it, and dispatch bomb sniffing dogs to “take care of the situation.”

You must be kidding, right?

First of all, if whoever mans the camera stations is anything like the luggage screeners in the airports, I very much doubt they will notice “suspicious packages” when we need them to. Second, following the trends of most modern terrorists, you’ll be looking at a suspicious bag at the West 4th Street station while a young man wearing a backpack suddenly explodes.

What if coordinated terrorists decide to drop “suspicious packages” all over the subways in Manhattan at about the same time? Thirty suspicious packages across New York. They’ll only actually blow up 10 of them, but you’ll be spread so thin by that point that you won’t even know how to respond.

Do you see what I’m getting at? How futile is this stuff? I know it’s hard to accept, I know it’s cold and maybe downright mean, and you may be saying, “Andrew, you’re full of shit, you don’t understand this at all,” but this is what I say to all this spending:

Fuck it. Fuck it all. Don’t spend a god-damn dime on pre-empting a terrorist attack.

Spend it, instead, on providing health care for sick Americans. On making sure the unemployed get employed so they don’t turn to crime. On focusing on education in poor neighborhoods where crime is common. In the end, spend $200 million on any of those and you’ll probably save a few hundred lives every year; at least we can measure it, and at least I don’t have to sacrifice my civil liberties for it.

In this country, we spend over $400 billion on defense. That’s more than our combined spending on Education, Housing, Justice, Housing Assistance, Environment, Employment, Science/Space, and Transportation, among other things. And it’s not just slightly more; it’s $100 billion more.

Keeping applications open

Just an interesting post I made to OSNews in response to someone saying that IE “starts quickly” while Firefox “takes forever.”

Just to clear things up: the only reason IE starts faster on Windows is that IE is technically “always running.” The only thing that has to “start” is the creation of a window with an “IE control” in it.

I get the same behavior on Linux by running galeon -s when my X session starts. This runs Galeon in “server mode,” which means it’s always in memory, and when I run Galeon (on my laptop, I press ALT+F1 to run my browser), it starts in less than half a second. If Firefox had a similar mode, it could offer you the same thing.

As for OpenOffice.org, it’s true that the start time is relatively slow. I’m sure they’ll get around to optimizing it.

Personally, I think the obsession people have with start times on Linux and Windows machines is due to a basic design flaw in most window managers. Applications should really only start up once; if you start an application multiple times in a day, you’re essentially performing redundant computation. The program can sit in memory, and if it really isn’t used in a while, it will get paged out anyway by our modern virtual memory implementations.

In OS X, for example, you can get the same effect as galeon -s or IE’s “preloading” simply by not quitting an application after all its windows are closed. This leaves the application running, and when you open a new window it will be nearly instantaneous. (Strangely enough, many old Windows/Linux freaks are sometimes “annoyed” by this aspect of OS X, since in the Linux/Windows world up to now, closing all of an application’s windows has been equivalent to closing the application itself.)
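If you want the same trick, it amounts to one extra line at X startup. Something like this in ~/.xsession (or wherever your distro starts your session; the window manager shown is just an example):

    # start Galeon in server mode so it stays resident in memory
    galeon -s &
    # then hand the session over to your window manager of choice
    exec fluxbox

After that, launching the browser is just a matter of asking the already-running process for a new window.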

GNU ddrescue and dd_rescue and dd_rhelp, what the?

Wow. I hate when shit like this happens.

Apparently there are three tools out there to help with the same thing. First, there’s dd_rescue, the tool I was using earlier (which ships with Ubuntu in a Debian package called… ddrescue). Then, there’s dd_rhelp, a shell script frontend to dd_rescue that implements a rough algorithm to minimize the amount of time spent waiting on bad-block reads.

Then, there’s GNU ddrescue, which is a C++ implementation of dd_rescue plus dd_rhelp.

I only just realized this and so now I’ve compiled a version of GNU ddrescue to pick up my recovery effort. It’ll probably help with one of the partitions that seems particularly messed up.

So far, the nice thing about GNU ddrescue is that it seems faster and more responsive. Plus, it has a real logging feature: if you enable it and then CTRL+C the app, you can restart it and it’ll automatically pick up where it left off.
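The invocation is pleasantly simple. Here’s a sketch of the resume trick with made-up file names; the trailing logfile argument is what enables it:

    # first pass; the logfile records which blocks have been tried
    ddrescue /dev/sdb5 partition.img rescue.log
    # ...CTRL+C at any point, then run the exact same command to resume...
    ddrescue /dev/sdb5 partition.img rescue.log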

UPDATE: wow, good thing I switched. GNU ddrescue is significantly faster just in terms of raw I/O performance. I jumped from 4GB of this partition being rescued (which took 30 minutes with dd_rescue) to 6GB in the last ten minutes. It seems at least 3x faster. I also like that the GNU info page describes the algorithmic approach in-depth.

Fried hard disk ruins weekend

So, one of my employers ended up with a fried hard disk, for the second time in a row. The main reason is that the PC this HD is contained in sits in a corner with little-to-no airflow.

In order to recover the drive, I am taking a different approach from my last recovery effort, mainly out of necessity. This disk is seriously damaged: lots of bad sectors, and its partitions are not readable by any NTFS driver, be it Microsoft’s or the open source one. That makes simply using the wonderful R-Studio tool I used last time impossible for now, since it won’t even see the drive properly within Windows, and it hangs all over the place.

Indeed, what I needed to do was drop down a layer of abstraction: away from filesystems, and into blocks and sectors. Unfortunately, in the Windows world this drop is difficult, so I had to use my Linux laptop to make the jump.

I found a wonderful tool to help me out called dd_rescue, which is basically a dd with the added abilities to continue on error, to start at a specified position in the input/output files, and to run a copy in reverse. These features let one really work around bad sectors and even damaged disk hardware to get as much data out as possible.
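If you’re curious, the usage looks something like this. The offsets are invented for illustration, and the flag spellings are from memory, so check dd_rescue -h before trusting them:

    # keep going past read errors, starting 2GB into both input and output
    dd_rescue -s 2048M -S 2048M /dev/sdb5 image.bin
    # copy in reverse to sneak up on a bad patch from the other side
    dd_rescue -r -s 4096M /dev/sdb5 image.bin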

Unfortunately, the use of this tool was encumbered by my laptop’s relatively simple bus design. Apparently, if I put two devices on my USB bus (like the two HDs I was using for this process), the bus would slow to a crawl, and the copy would move along at an unbearable 100kB/sec. I tried utilizing FireWire and USB together, but got only marginal improvements. What befuddles me is that in the end, the fastest combination I could come up with was reading from the FireWire enclosure with my laptop and writing to the FireWire enclosure of my desktop across the LAN using Samba. Very strange indeed. Now my performance is more like 6MB/sec, factoring in all the breaks dd_rescue takes when it encounters errors. I have 6GB of the more critical partition written, but it’ll probably take a couple more hours to have a big enough chunk that I can test R-Studio’s recovery of it.
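Concretely, the winning combination boiled down to something like the following two commands (modulo the exact share name and device node):

    # mount the Samba share exported by the Windows XP desktop
    mount -t smbfs //desktop/rescue /mnt/smb
    # image the damaged partition across the LAN, skipping over bad sectors
    dd_rescue /dev/sdb5 /mnt/smb/image/sdb5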

The only reason I’m even writing about this is because I find it hilarious how many layers of abstraction I am breaking through to do a relatively low-level operation. Think about it:

  1. My broken IDE drive is converted to FireWire by a FireWire-IDE bridge.
  2. My FireWire PCMCIA adapter is allowing my notebook to take in that connection.
  3. The Linux kernel is allowing FireWire to be accessed via various ieee1394 OHCI drivers.
  4. The Linux kernel is abstracting the FireWire disk as a SCSI disk, using emulation.
  5. The SCSI disk is being read by dd_rescue and written to a file, which exists at the path /mnt/smb/image/sdb5.
  6. That path seems local, but is actually a mount point. That mount point seems physical, but is actually handled by a Samba driver.
  7. The writes by dd_rescue to that image file are being sent through the kernel’s TCP/IP stack, flying through my switch, and being accepted by Windows XP’s network stack.
  8. Windows XP is writing that data to an NTFS drive, which is itself connected by a FireWire-IDE bridge (and therefore all the above steps’ equivalents for Windows apply).

I am surprised that, with that many layers, this copy is even working. I really should have just taken a machine apart and connected these drives directly via IDE, to save myself a few layers.

On the security of an e-mail address

I was just looking at my strange contact page, where I list my e-mail address using a sort of obfuscated string with _ and * characters mixed in. And then I saw someone’s e-mail address listed on the web with the following format:

user () domain ! com.

At that point, I started to think about all the other variations of this spam-protection trend I’ve seen, like user ///at\\\ domain ///dot\\\ com, and I realized that many of us are taking the wrong approach, myself included. The one above could easily be found by knowing the common TLDs and working backwards from there. If I find a “com,” “org,” or “net” and then look at the string tokens that occur before it, I can assume any run of valid characters (say, alphanumerics) followed by whitespace or invalid characters (like parentheses and exclamation points) is a valid part of the address. From there, it’s easy to split user () domain ! com into its proper parts and reconstruct the e-mail address. The same approach works for, say, user ///at\\\ domain ///dot\\\ com.
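To make that concrete, here’s a toy version of the kind of heuristic scan I’m describing; the pattern and file name are just illustrative, a sketch of the idea rather than a real harvester:

    # toy harvester: grab a few word-ish tokens, separated by junk, ending in a TLD
    grep -Eio '([a-z0-9._-]+[^a-z0-9]{1,10}){2,4}(com|net|org)' page.html

Both user () domain ! com and the at/dot variants fall right out of a pattern like that, which is exactly why funny separators alone don’t buy you much.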

So what I realized is that perhaps it would be better to insert other e-mail addresses in there, ones that might get picked up as part of the address even by a heuristic scan. For example,

user __at__ domain :: NOT spam@example.com :: __dot__ com

That seems more secure to me 😉 Another approach is just to prevent the TLD from being a complete token. This is the approach I took. Turn com into c_o__m or something, and you’re less likely to get picked up in a scan that is searching for “com”.