While web companies are building massive-scale data warehouses on top of Hadoop and analyzing data in every manner under the sun, Lockheed Martin is trying to help its government systems embrace a new world without breaking their mainframes.
Programs such as Social Security and food stamps still run on mainframes and COBOL, while others implemented enterprise data warehouses in the 1990s that are now reaching their scalability limits, and none of it is going anywhere. In some cases, particularly for programs and applications that can’t go offline, the process is like changing the engine of a train while the train is still running.
As the next step, data-preparation software is coming from a startup called Trifacta. That company, like a handful of other startups including Paxata and Tamr, is using machine learning and a relatively streamlined user experience to simplify the process of transforming data from its raw form into something that analytic software or applications can actually use. These tools do the same job as legacy data-integration, or ETL, tools, only they’re a lot easier to use and designed with big data stores such as Hadoop in mind.