Scouring The Deep Web: U.S. Builds ‘Memex’ Search Engine To Catch Cyber Criminals Lurking In The Web’s Deepest Recesses
Victoria Woollaston, writing in the February 9, 2015 edition of London’s Daily Mail Online, describes the deep web as “a hive of illegal activity, rife with child pornography, drug deals, and human trafficking. But it is ‘buried’ so deep it is considered out of reach of mainstream search engines, and of many law enforcement agencies, till now.”
Ms. Woollaston notes that “the Defense Advanced Research Projects Agency (DARPA) has developed an engine dubbed ‘Memex,’ a combination of memory and index, that not only scours content on this so-called dark net but also identifies subtle patterns in activity. Memex was announced by DARPA last year, and the agency recently gave Scientific American a preview of the software.” DARPA added that “today’s web searches use a centralized, ‘one size fits all’ approach that searches the Internet with the same set of tools for all queries. But common search practices miss information on the deep web, the parts of the web not indexed by standard commercial search engines, and ignore shared content across pages.”
“The Dark Net consists of a network of encoded websites that sit behind the publicly available websites and cannot be found with normal search engines. It came to prominence,” she writes, “in 2012, when the FBI made a series of raids on Silk Road, an online marketplace described as the ‘eBay for illegal purchases,’ mainly drugs, but also many other forms of illegal activities.”
“Hidden capabilities that let users email and host file storage through encrypted and anonymous networks are provided by services running through The Onion Router (Tor).”
“Memex was designed to overcome these challenges by extending ‘the reach of current search capabilities,’ and to quickly and thoroughly organize subsets of information based on individual interests. It looks behind standard search results for patterns, links, and similar behaviors. Memex scours all aspects of the Deep Web, including those hidden in the Dark Net, to create data maps that might reveal clues about illegal activity,” Ms. Woollaston writes.
“In particular, DARPA wants to use Memex to uncover human trafficking rings by searching for patterns in the number of online sex adverts being posted from certain regions, or porn sites featuring the same email addresses or phone numbers. These patterns could reveal links that human investigators could miss,” explained Scientific American. “We’re envisioning a new paradigm for search that would tailor indexed content, search results, and interface tools to individual users and specific subject areas, and not the other way around,” said Chris White, DARPA Program Manager. “By inventing better methods for interacting with and sharing information, we want to improve search for everyone and individualize access to the information. Ease of use for non-programmers is essential,” he added.
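The pattern search described above, linking adverts that carry the same email address or phone number, can be illustrated with a small inverted index. This is only a minimal sketch and not DARPA’s method: the regular expressions, the function names (`extract_contacts`, `shared_contacts`), and the sample adverts are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Deliberately simple patterns (assumptions); real data would need more robust parsing.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def normalize_phone(raw):
    """Strip everything except digits so differently formatted numbers compare equal."""
    return re.sub(r"\D", "", raw)


def extract_contacts(text):
    """Return the set of lowercased emails and digit-only phone numbers found in text."""
    emails = {m.lower() for m in EMAIL_RE.findall(text)}
    phones = {normalize_phone(m) for m in PHONE_RE.findall(text)}
    return emails | phones


def shared_contacts(ads):
    """Map each contact detail to the ids of the adverts that mention it,
    keeping only contacts that appear in more than one advert."""
    index = defaultdict(set)
    for ad_id, text in ads.items():
        for contact in extract_contacts(text):
            index[contact].add(ad_id)
    return {contact: ids for contact, ids in index.items() if len(ids) > 1}


# Hypothetical sample adverts, invented for this example.
sample_ads = {
    "a1": "Call 555-010-1234 or mail x@example.com",
    "a2": "Reach me: (555) 010 1234",
    "a3": "Write to other@example.org",
}
print(shared_contacts(sample_ads))
```

Here the two differently formatted phone numbers normalize to the same digits, so adverts `a1` and `a2` are linked, while the two email addresses each appear only once and are dropped. A link of this kind is a lead for a human investigator to check, not proof of a connection.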
“The Memex Program gets its name from a hypothetical device described in ‘As We May Think,’ a 1945 article for The Atlantic Monthly. It was written by Vannevar Bush, Director of the U.S. Office of Scientific Research and Development (OSRD) during WWII. In the article, Memex was described as an analog computer that would supplement human memory. It would store and automatically cross-reference all of the user’s books, records, and other information. The cross-referencing, which Mr. Bush called ‘associative indexing,’ would let users quickly search large amounts of information and gain insights from it,” Ms. Woollaston noted.
Not surprisingly, Ms. Woollaston writes that “the U.K. also has a Deep Web targeting initiative,” and I suspect China, Russia, and others are also in this game. Last December, “the U.K. government said that a specialist unit was being set up to hunt down pedophiles using the Dark Web to share child pornography.” Additionally, she adds, Britain’s “National Crime Agency and the Government Communications Headquarters (GCHQ) will use advances in analyzing images and communications ‘to trace the digital footprints’ left by the users who share them.” Prime Minister David Cameron said at the time that the new unit is aimed at “shining a light on the web’s darkest corners,” as he announced a package of measures designed to tackle online child abuse.
Obviously, this kind of technology, as it matures, will likely be beneficial to a number of industries and professions, from Wall Street to law enforcement, the Intelligence Community, the medical profession, and numerous others. I do worry about its potential abuse, as well as analysts becoming lazy or too dependent on this kind of sophisticated search analysis, and not maintaining an insatiably curious and skeptical outlook about what this kind of algorithmic data-mining can and cannot do. I suspect that the outcome depends on the amount of good information that the software has access to, and that one should not consider this approach to be the holy grail for solving hard problems in a short amount of time. It may well do that in most cases, and will be tremendously beneficial. But pattern recognition isn’t perfect, and a clever adversary will employ denial and deception to obfuscate, mask, and otherwise portray a false ‘picture’ of what this approach may reveal. It isn’t a panacea; intelligence analysts, senior policy makers, Wall Street, and others must maintain a healthy dose of skepticism, and use this software as a potential force multiplier, not as a substitute for critical thinking and insatiable curiosity.
Finally, who are the leaders in this field? America, Britain, China, Russia, Israel, etc.? What are our peers, near-peers, and adversaries doing in this area — why, and to what end? V/R, RCP