I would like to store .pdf files into a SQLite database. I know the preferred method is to store the .pdf locally on the android device and then store the path in the database but I want users to be able to download the apk and not have to worry about doing any other device setup. I have also read that in order to store .pdf in the database, it needs to be converted to bytes and then stored in a BLOB. However, I'm writing this in Kotlin and I'm truthfully not sure where to start with converting to bytes and then storing in BLOB. Can anyone provide sample code for doing this?

1 Answer

> I know the preferred method is to store the .pdf locally on the android device and then store the path in the database but I want users to be able to download the apk and not have to worry about doing any other device setup.

An APK is not limited to just database files; all sorts of files can be included in the package as either assets or resources.

You may well encounter issues trying to retrieve pdf files stored in the database due to their size.
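One likely source of such issues (an assumption here, as it is not named above) is that an Android Cursor reads rows through a CursorWindow of limited size, so a row holding a large BLOB may not fit. A workaround sometimes used is to read the BLOB in slices with SQLite's substr function, which works on bytes when given a BLOB. The sketch below only computes the slices; the helper name blobChunks and the pdf_blob column in the comment are hypothetical.

```kotlin
// Hypothetical helper: splits a BLOB of totalLength bytes into (start, length) pairs.
// start is 1-based, matching SQLite's substr(X, Y, Z).
fun blobChunks(totalLength: Int, chunkSize: Int): List<Pair<Int, Int>> {
    require(totalLength >= 0) { "totalLength must not be negative" }
    require(chunkSize > 0) { "chunkSize must be positive" }
    val chunks = mutableListOf<Pair<Int, Int>>()
    var start = 1
    while (start <= totalLength) {
        val length = minOf(chunkSize, totalLength - start + 1)
        chunks.add(start to length)
        start += length
    }
    return chunks
}

// Each pair could then drive a query along the lines of (pdf_blob is a hypothetical column):
//   SELECT substr(pdf_blob, ?, ?) FROM pdf WHERE id = ?
// with the returned slices appended to a file in order.
```

Even with such a workaround the extra queries and reassembly add complexity, which is part of why storing only the file name is the simpler route.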

> I have also read that in order to store .pdf in the database, it needs to be converted to bytes and then stored in a BLOB.

A file is a stream of bytes (demo below shows this).

As such it is still recommended that you simply store a value, such as the file name, in the database and retrieve the pdf file from a suitable location.
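To make the "a file is a stream of bytes" point concrete, here is a minimal sketch in plain Kotlin (no Android APIs needed). The helper names fileToBytes and looksLikePdf are hypothetical and exist only for illustration; reading a whole file into memory like this is only sensible for small files.

```kotlin
import java.io.File

// Hypothetical helper: loads the entire file into memory as a ByteArray.
fun fileToBytes(file: File): ByteArray = file.readBytes()

// Hypothetical helper: a PDF starts with the ASCII bytes "%PDF-", so the raw
// bytes can be inspected without any PDF library.
fun looksLikePdf(bytes: ByteArray): Boolean {
    val header = "%PDF-".toByteArray(Charsets.US_ASCII)
    return bytes.size >= header.size &&
        bytes.copyOfRange(0, header.size).contentEquals(header)
}
```

A ByteArray obtained this way is what would be handed to the database as a BLOB, should you decide to go that way.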


Demo


First, 2 PDFs exist somewhere (here, just on the PC).

An SQLite Tool (Navicat for SQLite in this case) was used to create a pre-populated database using the following SQL:-

    DROP TABLE IF EXISTS pdf;
    CREATE TABLE IF NOT EXISTS pdf (id INTEGER PRIMARY KEY, pdf_title TEXT, pdf_name TEXT);
    INSERT INTO pdf VALUES (1,'Blah PDF','pdf001.pdf'),(2,'Not Blah PDF','pdf002.pdf');

This results in a pdf table holding the two inserted rows (ids 1 and 2).

The database and the connection were closed, and then Navicat for SQLite was closed, resulting in a single database file, also located on the PC.

At this stage the base data exists.

A new empty Kotlin project is created in Android Studio and the assets folder is created. The 3 files (the 2 PDFs and the database) are copied into the assets folder, with the database file renamed to mypdfstore.db.

A class that extends the SQLiteOpenHelper class called DBHelper was created:-

    import android.content.Context
    import android.database.Cursor
    import android.database.sqlite.SQLiteDatabase
    import android.database.sqlite.SQLiteOpenHelper

    const val DATABASE_NAME = "mypdfstore.db"
    const val DATABASE_VERSION = 1
    const val PDF_TABLE_NAME = "pdf"
    const val PDF_ID_COLUMN = "id"
    const val PDF_TITLE_COLUMN = "pdf_title"
    const val PDF_NAME_COLUMN = "pdf_name"
    const val PDF_CREATE_TABLE_SQL = "CREATE TABLE IF NOT EXISTS $PDF_TABLE_NAME ($PDF_ID_COLUMN INTEGER PRIMARY KEY, $PDF_TITLE_COLUMN TEXT, $PDF_NAME_COLUMN TEXT);"

    class DBHelper(context: Context) : SQLiteOpenHelper(context, DATABASE_NAME, null, DATABASE_VERSION) {

        override fun onCreate(p0: SQLiteDatabase?) {
            p0?.execSQL(PDF_CREATE_TABLE_SQL)
        }

        override fun onUpgrade(p0: SQLiteDatabase?, p1: Int, p2: Int) {
            TODO("Not yet implemented")
        }

        companion object {
            private var instance: DBHelper? = null

            fun getInstance(context: Context): DBHelper {
                if (instance == null) {
                    /* copy the pre-populated database from the assets if it is not already in place */
                    if (!ifDBExists(DATABASE_NAME, context)) {
                        getDBFromAsset(DATABASE_NAME, DATABASE_NAME, context)
                    }
                    instance = DBHelper(context)
                }
                return instance as DBHelper
            }

            fun ifDBExists(databaseName: String, context: Context): Boolean {
                val db_file = context.getDatabasePath(databaseName)
                if (db_file.exists()) return true
                /* make sure the databases directory exists, ready for the copy */
                val db_directory = db_file.parentFile
                if (db_directory != null) {
                    if (!db_directory.exists()) db_directory.mkdirs()
                }
                return false
            }

            fun getDBFromAsset(assetName: String, databaseName: String, context: Context) {
                val i = context.assets.open(assetName)
                val o = context.getDatabasePath(databaseName).outputStream()
                val bufferSize = 1024 * 8
                val buffer = ByteArray(bufferSize)
                var length = i.read(buffer)
                while (length > 0) {
                    /* write only the bytes actually read, otherwise the last chunk pads the copy */
                    o.write(buffer, 0, length)
                    length = i.read(buffer)
                }
                o.flush()
                o.close()
                i.close()
            }
        }

        fun getAllFromPDF(): Cursor {
            return this.writableDatabase.query(PDF_TABLE_NAME, null, null, null, null, null, null)
        }
    }
  • The companion object deserves special attention. It
    • has a private var called instance, which allows a single instance of the database helper to be passed/used wherever it is needed (as long as a Context is available).
    • has a function getInstance that returns the instance, instantiating it just the once.
    • before instantiating the instance, checks whether the database exists by calling the ifDBExists function.
      • if it does not exist, then the getDBFromAsset function is called to do as its name says (note how the asset is read into a ByteArray (BLOB in Room)).
      • if it does exist, then nothing is done and processing continues.
    • as the database now exists, the instance variable is instantiated and returned.
  • It should be noted that even though the overridden onCreate function would create the sole table, the function should never actually be called, as the database is a pre-populated/pre-existing database copied from the assets.
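The asset copy described above boils down to a buffered stream copy. A minimal sketch in plain Kotlin follows; the helper name copyStream is hypothetical, and in the demo the two streams would be the ones obtained from context.assets.open and context.getDatabasePath(...).outputStream(). The detail worth noticing is that only the bytes actually read on each pass are written, since the final read rarely fills the whole buffer.

```kotlin
import java.io.InputStream
import java.io.OutputStream

// Hypothetical helper: copies everything from input to output through a
// fixed-size ByteArray buffer and returns the number of bytes copied.
fun copyStream(input: InputStream, output: OutputStream, bufferSize: Int = 8 * 1024): Long {
    require(bufferSize > 0) { "bufferSize must be positive" }
    val buffer = ByteArray(bufferSize)
    var total = 0L
    while (true) {
        val read = input.read(buffer)
        if (read < 0) break           // -1 signals the end of the stream
        output.write(buffer, 0, read) // write only the bytes just read
        total += read
    }
    output.flush()
    return total
}
```

Kotlin's standard library offers InputStream.copyTo, which does the same job in one call; the explicit loop is shown only to make the ByteArray handling visible.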

Additionally another class PDFReader has been created:-

    import android.content.Context

    class PDFReader {
        fun doesPDFExist(pdfName: String, context: Context): Boolean {
            try {
                /* use {} closes both streams, even if the copy fails */
                context.assets.open(pdfName).use { pdf ->
                    context.getDatabasePath(pdfName).outputStream().use { o ->
                        pdf.copyTo(o)
                    }
                }
            } catch (e: Exception) {
                e.printStackTrace()
                return false
            }
            return true
        }
    }
  • This class is just intended to demonstrate access to the PDF assets by copying them into the same folder where the database is stored (overwriting them if need be). It is not intended to display the PDFs; that is outside the scope of the answer.

The above classes do nothing unless they are used, so here is some activity code, MainActivity:-

    import android.database.Cursor
    import android.os.Bundle
    import androidx.appcompat.app.AppCompatActivity

    class MainActivity : AppCompatActivity() {
        lateinit var db: DBHelper
        lateinit var csr: Cursor

        override fun onCreate(savedInstanceState: Bundle?) {
            super.onCreate(savedInstanceState)
            setContentView(R.layout.activity_main)

            /* Get the database instance */
            db = DBHelper.getInstance(this)
            /* extract the data from the database */
            csr = db.getAllFromPDF()
            val id_ix = csr.getColumnIndex(PDF_ID_COLUMN)
            val title_ix = csr.getColumnIndex(PDF_TITLE_COLUMN)
            val name_ix = csr.getColumnIndex(PDF_NAME_COLUMN)
            /* for each row copy the file from the asset to the databases folder */
            while (csr.moveToNext()) {
                val r = PDFReader().doesPDFExist(csr.getString(name_ix), this)
            }
            csr.close()
        }
    }

When the app is run, App Inspection shows that the mypdfstore.db database exists and has the data from the original database created in the SQLite tool. Additionally, the 2 PDF files exist in the same directory (NOTE: the databases directory is used only for brevity of the demo).

Device Explorer confirms that the files exist and that the PDF files are the expected size. The database has two additional files, mypdfstore.db-wal and mypdfstore.db-shm; these exist because WAL (write-ahead logging) is used.

Additional (save as a BLOB)

This additional content is unlikely to be very useful in practice (you would want to save the PDFs into the pre-existing database beforehand rather than from within the app), but it does show how you can save a PDF into the database.

  • It should be noted that both PDFs are relatively small (about 180k; bar the name, they are identical).

(1) The following function is added to the DBHelper class:-

    /* requires import android.content.ContentValues */
    fun savePDF(pdfName: String, pdfStream: ByteArray, id: Long? = null): Long {
        val cv = ContentValues()
        if (id != null) cv.put(PDF_ID_COLUMN, id)
        cv.put(PDF_NAME_COLUMN, pdfName)
        /* the PDF bytes are deliberately stored in the pdf_title column */
        cv.put(PDF_TITLE_COLUMN, pdfStream)
        return this.writableDatabase.insert(PDF_TABLE_NAME, null, cv)
    }
  • Note that this relies on SQLite's flexibility in allowing any type of data to be saved in a column of any type (the exception being a column defined as INTEGER PRIMARY KEY, i.e. the rowid or an alias thereof). That is, the PDF (BLOB) is saved in the pdf_title column.

(2) The following function has been added to the MainActivity:-

    fun getPDFStream(pdfName: String): ByteArray {
        var rv = ByteArray(0)
        val pdfFile = this.getDatabasePath(pdfName)
        if (pdfFile.exists()) {
            rv = pdfFile.readBytes()
        }
        return rv
    }

(3) The while loop that traverses the Cursor has the following additional line:-

    db.savePDF("alt_${csr.getString(name_ix)}", getPDFStream(csr.getString(name_ix)))

Now, when run, App Inspection shows that the pdf_title column for the rows with ids 3 and 4 contains a byte stream.

Furthermore, running the query SELECT typeof(pdf_title) AS column_type, length(pdf_title) AS L FROM pdf; from App Inspection shows that:

  • for ids 1 and 2 the column type (actually the storage class) is TEXT, but for ids 3 and 4 the type is BLOB.
    • the length of the data also clearly indicates the difference: for ids 1 and 2 the length is 8 and 12 respectively, whilst for ids 3 and 4 the length is considerably greater (around 180k, as would be expected).

This addition also shows that storing the data as a BLOB, rather than as files, is a more long-winded/complex process, i.e. you would have to extract the ByteArray and then convert it back for opening as a PDF (perhaps refer to PDF to byte array and vice versa).
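As a rough illustration of that extra step, the sketch below writes a ByteArray back out as a file that a PDF viewer could then open. It is plain Kotlin; the helper name bytesToPdfFile is hypothetical, and on Android the bytes would typically come from Cursor.getBlob on the BLOB column, with the target directory being something like the app's cache directory (both are assumptions, not part of the demo above).

```kotlin
import java.io.File

// Hypothetical helper: writes the bytes of a PDF (for example the ByteArray
// returned by Cursor.getBlob for the BLOB column) to a file in the given directory.
fun bytesToPdfFile(pdfBytes: ByteArray, directory: File, fileName: String): File {
    require(pdfBytes.isNotEmpty()) { "No PDF data to write" }
    directory.mkdirs() // make sure the target directory exists
    val target = File(directory, fileName)
    target.writeBytes(pdfBytes) // creates the file or overwrites an existing one
    return target
}
```

With the file-name approach none of this is needed, as the file is already there to be opened.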

Note that the code above is not intended to represent best practice; it has been written to put forward the core techniques concisely, i.e. there is room for improvement.

1 Comment

Mike, this is one of the most thorough answers I have received on StackOverflow. Thanks for taking the time. This is incredibly helpful.
