My goal was to create a process for importing data into Hive using Sqoop 1.4.6. It needed to be simple (or easily automated) and use a robust file format.
Importing data from a relational database into Hive should be easy. It’s not. When you’re first getting started, there are a lot of snags along the way: configuration issues, missing jar files, formatting problems, schema issues, data type conversion, and the list goes on. This post shines some light on a way to use command-line tools to import data as Avro files and create Hive schemas, along with solutions for some of the problems you’ll hit along the way.