IN THE NEWS
Every once in a while, my two careers--first in counterterrorism and counterinsurgency studies, then in the software industry--intersect. I look for opportunities to put one set of skills at the service of the other, but it's not as easy as you'd think.
I encountered just such an intersection of these disparate worlds when I was in the car listening to an American Radio Works special about the expansion of surveillance since 9/11. Not surprisingly, the producers interviewed John Poindexter, the person who headed the Total Information Awareness (TIA) initiative, a federal research and development program looking for ways to fix some of the informational cracks that the 9/11 terrorists slipped through.
By now, we've all heard about how, in 2001, different federal agencies--the FBI, the CIA, INS, and so forth--had bits and pieces of information about the al Qaeda hijackers. None of them had the complete picture, so the warning bells weren't ringing, or not loudly enough. Aside from human barriers, such as the FBI's signature unwillingness to work with other agencies, there were technical obstacles to building a complete picture of the members and plans of Mohammed Atta's group. While the FBI maintained one database of information, the FAA and other agencies each had their own, separate repositories.
Aha! I thought. Here's a classic problem for people in my current field, database applications. The challenges for linking these databases together are often hard, but they're well-known--as are the solutions. The US government could never achieve a "God's eye view" of every possible threat, but perhaps it could better pool and analyze data from different sources. The privacy issues are significant, but again, people who have built software based on databases have tackled these problems already. Just as your bank and credit card companies have information "firewalls" when they share information with one another, so too could the federal databases tracking terrorist suspects.
While the details of connecting databases with one another may be dry and understood only by specialists, the basic challenges are rather simple to understand. For example:
- Information is often out of date. Whenever a suspect moves to a new address, changes cell phone carriers, or receives a wire transfer from overseas, someone has to enter the new information, or an automated system of some sort has to send the update to the database. Both ways of updating information are prone to failure: law enforcement agents are slow to type new information into the system, or the automatic updates don't happen as scheduled.
- Different databases record the same type of information in different ways. Every database is like a file cabinet: the way you organize information may not match the system someone else devises. There are standard, obvious categories for some things--medical and dental records, credit card receipts, and so on--but others, like school records, might not be so obvious. Similarly, the types of information different federal agencies store, and the categories they use, may not match what other agencies have devised.
- Different databases follow different conventions for the same type of information. In one office, people may flag urgent documents by putting red Post-It notes on the front page; in another, by putting them on top of the pile. Similarly, a database for one federal agency might flag the risk a suspect poses to national security on a 0-to-5 scale, with 0 being the most dangerous, while a different agency might use a scale from 1 to 10, with 10 being the most dangerous.
While these problems might sound trivial, imagine dealing with them across several complex databases, including different versions from different software developers, some so old that they are no longer supported, containing millions of individual records...
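The third problem, mismatched conventions, is easy to make concrete. Here's a minimal sketch of translating between the two hypothetical risk scales described above (the agency names and scale definitions are my own invented example, not anything from an actual federal system):

```python
def agency_a_to_b(score_a: int) -> int:
    """Map Agency A's 0-to-5 scale (0 = most dangerous) onto
    Agency B's 1-to-10 scale (10 = most dangerous)."""
    if not 0 <= score_a <= 5:
        raise ValueError(f"Agency A scores run 0-5, got {score_a}")
    # The scales run in opposite directions, so first invert:
    # 0..5 becomes 5..0, with 5 now meaning "most dangerous."
    inverted = 5 - score_a
    # Then stretch the 0..5 range onto 1..10.
    return 1 + round(inverted * 9 / 5)

print(agency_a_to_b(0))  # A's most dangerous suspect -> 10 on B's scale
print(agency_a_to_b(5))  # A's least dangerous suspect -> 1 on B's scale
```

Even this toy conversion requires someone to notice that the scales run in opposite directions; merging real databases means writing, testing, and maintaining hundreds of such mappings.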
I'd expect to hear that federal officials were struggling with these sorts of challenges. I didn't expect to hear that the person in charge of a program that was supposed to be working on these problems didn't understand the basics of them. Here's what that person, John Poindexter, said in the interview:
Because of the problem that we talked about earlier, of having the signal out there in all of this noise about innocent transactions, we recognized from the very beginning that privacy was going to be a huge issue. And, so, in addition to working on the technologies to find information, to find the signal in the noise and to make sense of it, we also began working on technologies that would protect the privacy of innocent people. I have this concept of what I call a "privacy appliance," which is a device that sits on top of a database of information. And, the appliance does several things. It accepts the query from the user, and checks.
Um. Er. Ahem. This technology already exists. In fact, it has been a standard part of database design for decades. The mechanism in question is called a view. Take any type of information in a database--say, the bank transactions of a suspect--and filter out all but the essential information that can be safely shared without violating the privacy rights of the individual in question, or of the people who send money to or receive money from this person. (After all, it may turn out that they're all innocent of any wrongdoing.) Give this distilled, scrubbed set of information--the view, in database parlance--to an FBI agent, who can then decide whether further investigation, requiring legal authorization to get more details, is warranted.
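To show how little machinery this takes, here's a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table, column names, and threshold are hypothetical, invented purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bank_transactions (
        id INTEGER PRIMARY KEY,
        account_holder TEXT,       -- personally identifying
        account_number TEXT,       -- personally identifying
        counterparty_country TEXT,
        amount_usd REAL,
        txn_date TEXT
    );
    -- The view exposes only what an analyst needs to spot a pattern
    -- (overseas wires above a threshold, amounts, dates); the
    -- identifying columns simply aren't part of it.
    CREATE VIEW suspicious_activity AS
        SELECT counterparty_country, amount_usd, txn_date
        FROM bank_transactions
        WHERE counterparty_country != 'US' AND amount_usd > 5000;
""")
conn.execute("INSERT INTO bank_transactions VALUES "
             "(1, 'J. Doe', '1234-5678', 'AE', 9000.0, '2001-06-15')")
conn.execute("INSERT INTO bank_transactions VALUES "
             "(2, 'J. Doe', '1234-5678', 'US', 40.0, '2001-06-16')")

# An agent queries the view, never the underlying table.
for row in conn.execute("SELECT * FROM suspicious_activity"):
    print(row)  # ('AE', 9000.0, '2001-06-15')
```

Grant the agent's account permission on the view but not the table, and the "privacy appliance" is done: the filtering happens inside the database, in a mechanism that SQL has offered since its earliest standards.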
And yet, the person who was running the Total Information Awareness program apparently thought a brand-new appliance was required to do what database views have accomplished for decades. As I said, Ahem.
Comments