PostgreSQL 9.4 and Beyond
JSON, Analytics, and More
Uptime Technologies
Satoshi Nagayasu @snaga
FOSSASIA 2015
Satoshi Nagayasu
• Satoshi Nagayasu
  – Database enthusiast. DBA and Data Steward.
  – Traveling Asia: Hong Kong, Shenzhen, Beijing, Singapore
• Uptime Technologies
  – Co-founder
  – Providing consulting services around Database and Platform Technologies.
• PostgreSQL
  – pgstatindex, pageinspect, xlogdump
  – PostgresForest, Postgres-XC (clusters)
  – Organizing Japanese Users Group.
What I'm doing on PostgreSQL
• Postgres Toolkit
  – Brand new PostgreSQL DBA tool
  – Stay informed at uptime.jp/go/pt
• Postgres Add-on for Hinemos
  – One of the most popular system management tools in Japan.
  – Monitoring, Alerting, Job Management, etc.
PostgreSQL and Hinemos
[Screenshot: Hinemos graphs for Number of sessions, Database Size, Cache Hit Ratio, Number of Written Blocks]
Hacking Hardware
• Raspberry Pi 2 & DE0 (FPGA)
• ZigBee (wireless)
Thanks to... • Magnus Hagander • Michael Paquier • Toshi Harada • Noriyoshi Shinoda • ... and many pg guys!
Agenda
• 9.4 Overview
• NoSQL (JSON and GIN Index)
• Analytics (Aggregation & Mat.View)
• Replication and Beyond (Logical Decoding)
• Administration (ALTER SYSTEM)
• Infrastructure (For Parallelization)
• Beyond 9.4
9.4 Overview
9.4 Overview - Status
• The first official release
  – 9.4 was released on 18 December 2014.
• The latest stable release
  – 9.4.1 was released on 5 February 2015.
9.4 Overview - Statistics
• 9.4.0 compared to 9.3.5
  – 3,750 files changed
  – 62,960 insertions (+)
  – 15,935 deletions (-)
9.4 Overview - Changes
• Server
  – Indexes, General Performance, Monitoring, SSL, Server Settings
• Replication and Recovery
  – Logical Decoding
• Queries
• Utility Commands
  – EXPLAIN, Views
• Object Manipulation
• Data Types
  – JSON
• Functions
  – System Information Functions, Aggregates
• Server-Side Languages
  – PL/pgSQL Server-Side Language
• libpq
• Client Applications
  – psql (Backslash Commands), pg_dump, pg_basebackup
• Source Code
• Additional Modules
  – pgbench, pg_stat_statements
Categories of Enhancements
• NoSQL (JSON and GIN Index)
• Analytics (Aggregation & Mat.View)
• Replication+ (Logical Decoding)
• Administration (ALTER SYSTEM)
• Basic Infrastructure (Parallelization)
NoSQL (JSON and GIN Index)
NoSQL - JSONB • JSON vs. JSONB
NoSQL - JSONB
• “Binary JSON”
  – Unlike JSON, which is stored as text, JSONB is stored in a binary form.
  – Faster for searching
• With JSONB...
  – Duplicate keys are not kept. The last one wins.
  – Key order is not preserved.
  – Can take advantage of GIN indexes.
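The two JSONB rules above (last duplicate wins, key order not preserved) can be mimicked on the client side. The following is a minimal Python sketch, not PostgreSQL code: `normalize_like_jsonb` is a hypothetical helper, and the key ordering it applies (shorter keys first, then bytewise) is an assumption about how jsonb stores object keys. The only thing to rely on is that the original order is not kept.

```python
import json


def normalize_like_jsonb(text):
    """Parse JSON text roughly the way jsonb normalizes it (illustrative only)."""

    def build_object(pairs):
        # Building a dict from the pairs keeps the LAST value of a repeated key.
        merged = dict(pairs)
        # Assumption: keys are ordered by length, then by their UTF-8 bytes.
        ordered = sorted(
            merged, key=lambda k: (len(k.encode("utf-8")), k.encode("utf-8"))
        )
        return {k: merged[k] for k in ordered}

    return json.loads(text, object_pairs_hook=build_object)


doc = normalize_like_jsonb('{"name": "a", "id": 1, "name": "b"}')
print(json.dumps(doc))
```

The plain JSON type, by contrast, stores the input text as written, so both `name` entries and the original key order would still be visible there.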
NoSQL - GIN Index
• JSON+btree vs. JSONB+GIN
  – Btree indexes vs. GIN index
[Chart: Table Index Size Comparison]
http://www.slideshare.net/toshiharada/jpug-studyjsonbdatatype20141011-40103981
Analytics (Aggregation & Materialized View)
Analytics - Aggregation • FILTER replaces CASE WHEN.
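To show what "FILTER replaces CASE WHEN" means in practice, here is a small sketch that runs both query styles and compares the results. It uses Python's built-in `sqlite3` module as a stand-in for a PostgreSQL server, which is an assumption for the sake of a self-contained example: it needs SQLite 3.30 or later, which accepts the same `FILTER (WHERE ...)` syntax on aggregates. The `orders` table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (status TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("paid", 100), ("paid", 50), ("cancelled", 70), ("pending", 30)],
)

# Older style: conditional aggregation through CASE WHEN.
case_when_sql = """
    SELECT count(CASE WHEN status = 'paid' THEN 1 END),
           sum(CASE WHEN status = 'paid' THEN amount END)
    FROM orders
"""

# 9.4 style: a FILTER clause attached to each aggregate.
filter_sql = """
    SELECT count(*) FILTER (WHERE status = 'paid'),
           sum(amount) FILTER (WHERE status = 'paid')
    FROM orders
"""

case_when_result = conn.execute(case_when_sql).fetchone()
filter_result = conn.execute(filter_sql).fetchone()
print(case_when_result, filter_result)
```

Both queries return the same row; the FILTER form keeps the condition next to the aggregate instead of burying it inside the argument.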
Analytics - Aggregation
• New Aggregate Functions
  – percentile_cont()
  – percentile_disc()
  – mode()
  – rank()
  – dense_rank()
  – percent_rank()
  – cume_dist()
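The difference between percentile_cont() and percentile_disc() is easiest to see on a tiny data set. The functions below are a Python sketch of the two definitions, not the server's implementation; in SQL the equivalent call would look like `percentile_cont(0.5) WITHIN GROUP (ORDER BY salary)`. The `salaries` list is made up, and empty input is not handled.

```python
import math


def percentile_cont(fraction, values):
    """Continuous percentile: interpolate linearly between neighbouring values."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)  # zero-based fractional position
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def percentile_disc(fraction, values):
    """Discrete percentile: first actual value whose position reaches the fraction."""
    ordered = sorted(values)
    row = max(math.ceil(fraction * len(ordered)), 1)  # one-based row number
    return ordered[row - 1]


salaries = [10, 20, 30, 40]
print(percentile_cont(0.5, salaries))  # interpolated, may not be an input value
print(percentile_disc(0.5, salaries))  # always one of the input values
```

With an even number of rows the continuous median falls between the two middle values, while the discrete one picks an existing row.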
Analytics - Aggregation • Ordered-set aggregates – mode(), most common value in a subset
Analytics - Aggregation • Ordered-set aggregates – rank(), rank of a value in a subset
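The semantics of mode() and of rank() used as an aggregate can be sketched in a few lines of Python. This is an illustration of the definitions, not server code: `hypothetical_rank` is a made-up name, the tie-breaking rule for `mode` (the value that sorts first) is an assumption, and the `scores` list is invented. In SQL the calls would be written as `mode() WITHIN GROUP (ORDER BY score)` and `rank(85) WITHIN GROUP (ORDER BY score)`.

```python
from collections import Counter


def mode(values):
    """Most common value; ties go to the value that sorts first (assumption)."""
    counts = Counter(values)
    best = max(counts.values())
    return min(v for v, n in counts.items() if n == best)


def hypothetical_rank(candidate, values):
    """Rank the candidate would get if it were added to the group (ascending)."""
    # rank() leaves gaps for ties, so the result is one more than the number
    # of existing values that sort strictly before the candidate.
    return 1 + sum(1 for v in values if v < candidate)


scores = [70, 85, 85, 90, 60]
print(mode(scores))
print(hypothetical_rank(85, scores))
```

The rank is "hypothetical" because the candidate value does not have to exist in the group; the function answers where it would land if it did.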
Analytics – Materialized Views
• REFRESH MATERIALIZED VIEW CONCURRENTLY myview
• Refreshes a materialized view without locking out concurrent SELECTs on it.
• Requires at least one unique index on the materialized view.
• Improves usability and availability.
Replication and Beyond (Logical Decoding)
Replication and Beyond – Logical Decoding
• A “logical” representation of changes, decoded from the replication stream
  – INSERT/UPDATE/DELETE operations
  – Can be replayed on a different version or platform
• pg_recvlogical command
  – Shows how it works
• Replication can be more flexible
  – BDR (Bi-Directional Replication), Slony, and more ...
  – Continuous backup as well
pg_recvlogical (contrib)
Administration (ALTER SYSTEM)
Administration - ALTER SYSTEM
• ALTER SYSTEM SET
  – Writes the new value to postgresql.auto.conf.
  – pg_reload_conf() reloads the settings.
  – postgresql.auto.conf takes priority over postgresql.conf.
• ALTER SYSTEM RESET
  – Removes values from postgresql.auto.conf.
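The precedence rule above can be pictured as a simple merge of two files. The Python sketch below is only a model of that rule: `parse_conf` and `effective_settings` are hypothetical helpers, the parser is deliberately simplified (it assumes `name = value` lines and treats every `#` as a comment start), and the file contents are made up.

```python
def parse_conf(text):
    """Parse 'name = value' lines, ignoring blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip("'")
    return settings


def effective_settings(postgresql_conf, auto_conf):
    """Merge both files; entries from postgresql.auto.conf override the others."""
    merged = parse_conf(postgresql_conf)
    merged.update(parse_conf(auto_conf))
    return merged


postgresql_conf = """
work_mem = 4MB            # edited by hand
shared_buffers = 128MB
"""
auto_conf = """
# written by ALTER SYSTEM SET work_mem = '64MB'
work_mem = '64MB'
"""

print(effective_settings(postgresql_conf, auto_conf))
```

In this model, ALTER SYSTEM RESET corresponds to dropping the entry from the second file, after which the hand-edited value in postgresql.conf applies again.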
Infrastructure (For Parallelization)
Dynamic Background Workers
• In 9.3, background workers had to be started at postmaster startup.
• From 9.4 on, they can be launched on demand.
• From a parallelization point of view...
  – Multiple background processes can be launched to execute child queries in parallel.
Dynamic Shared Memory
• Shared memory can be allocated on demand
  – e.g. by background workers
• The main segment (e.g. shared_buffers) is still fixed at startup
• Also supports a lightweight message queue
• From a parallelization point of view...
  – Several bgworker processes can share data and communicate with each other.
My Tiny Favorite (pl/pgsql stacktrace)
pl/pgsql stacktrace http://h50146.www5.hp.com/services/ci/opensource/pdfs/PostgreSQL_9_4%20_Ver_1_0.pdf
There are many other enhancements, so please try 9.4 as soon as you can.
Beyond 9.4
BRIN Index
• Block Range INdex
  – Holds "summary" data instead of raw data.
  – Reduces index size tremendously.
  – Also reduces creation/maintenance cost.
  – Needs an extra tuple fetch to get the exact record.
[Charts, Btree vs. BRIN: Index Creation (elapsed time, ms); Index Size (number of blocks); Select 1 record (elapsed time, ms)]
https://gist.github.com/snaga/82173bd49749ccf0fa6c
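The idea of keeping a summary per block range, and then rechecking the rows in matching ranges, can be sketched as follows. This is a toy model in Python, not the actual index structure: it assumes a min/max summary per range, `rows_per_page` is a made-up simplification, and the table is a plain list whose physical order follows the indexed values.

```python
def build_brin(values, pages_per_range=4, rows_per_page=2):
    """Summarize consecutive block ranges as (min, max, first_row, end_row)."""
    rows_per_range = pages_per_range * rows_per_page
    summaries = []
    for start in range(0, len(values), rows_per_range):
        chunk = values[start:start + rows_per_range]
        summaries.append((min(chunk), max(chunk), start, start + len(chunk)))
    return summaries


def brin_lookup(values, summaries, target):
    """Return row positions equal to target, scanning only candidate ranges."""
    matches = []
    scanned = 0
    for low, high, start, end in summaries:
        if low <= target <= high:
            # The summary only says the value MAY be in this range, so the
            # rows in the range must be fetched and rechecked.
            scanned += end - start
            matches.extend(i for i in range(start, end) if values[i] == target)
    return matches, scanned


table = list(range(0, 200, 2))  # 100 rows, physically sorted
index = build_brin(table)
matches, scanned = brin_lookup(table, index, 42)
print(len(index), matches, scanned)
```

The index holds one entry per range rather than one per row, which is where the size and build-time savings come from; the price is the extra fetch and recheck inside each candidate range. If the table were not physically ordered by the indexed column, many more ranges would match and the benefit would shrink.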
Commitfest 2015-2
CommitFest is a process to review, fix, and commit submitted patches.
• Parallel Seq Scan
• INSERT ... ON CONFLICT {UPDATE | IGNORE}
• File-level incremental backup
• and others...
Still a work in progress...
commitfest.postgresql.org
Wrap-up
• One of the most developer-friendly RDBMSes in the world.
• Analytics features and performance are improving.
• Things are moving toward parallel execution.
Resources
• www.postgresql.org
• www.planetpostgresql.org
• www.pgcon.org
Any questions?
Thank you!
• E-mail: snaga@uptime.jp
• Twitter, Github: @snaga
• WeChat: satoshinagayasu

PostgreSQL 9.4 and Beyond @ FOSSASIA 2015 Singapore