PostgreSQL and the future Aaron Thul
Who am I? •  Computer & Database Geek •  Formerly a Debian SysAdmin •  Presently Vice President of Technology at a EMOL •  PostgreSQL Evangelist •  Penguicon Organizer
Ground rules •  Questions are good •  Arguments are bad
Agenda •  Big News •  New Features •  State of PostgreSQL Community •  Evil •  Profit
First Things First •  There is some big news
First Things First •  There is some big news •  PostgreSQL 8.5 is…
First Things First •  There is some big news •  PostgreSQL 8.5 is… • PostgreSQL 9.0 !
Why 8.5 became 9.0 •  Built-in log-streaming replication •  Hot standby
New Features
Streaming replication •  Warm Standby 8.x to Hot Standby •  New kinds of postmaster processes, walsenders and walreceiver •  Thanks to Streaming Replication, lag should be nearly zero
DROP IF EXISTS •  Columns •  Constraints
Better messages for unique violation •  Improve unique-constraint-violation error messages to include the exact values being complained of.
Deferrable Uniqueness •  Massive unique-key reassignments
MOVE FORWARD or BACKWARD •  MOVE {FORWARD,BACKWARD} X •  More easily move around in a curser
’samehost’ and ’samenet’ •  Does anyone know what HBA stands for? •  To lazy to list all the hosts on your network •  CIDR-ADDRESS
GRANT ALL •  Grant privileges on all existing tables •  Automatically grant privileges on all tables that will be created in this database in the future
TRIGGERS on columns •  SQL-compliant triggers on columns, ie fire only if certain columns are named in the UPDATE's SET list •  Help resolve circular TRIGGER issues
Conditional TRIGGERS CREATE TRIGGER test_u AFTER UPDATE ON test FOR EACH ROW WHEN ( NEW.i <= 0 ) EXECUTE PROCEDURE test_u();
VACUUM FULL •  VACUUM FULL will start to behave like CLUSTER •  Will not actually be reordering rows •  VACUUM FULL INPLACE
Buffers info for explain explain ( analyze on, buffers on ) select count(*) from pg_attribute ; QUERY PLAN ---------------------------------------------------- ---------------------- Aggregate (cost=64.70..64.71 rows=1 width=0) (actual time=0.466..0.466 rows=1 loops=1) Buffers: shared hit=18 read=21 -> Seq Scan on pg_attribute (cost=0.00..59.56 rows=2056 width=0) (actual time=0.002..0.301 rows=2002 loops=1) Buffers: shared hit=18 read=21 Total runtime: 0.492 ms
The list goes on •  Multi-threaded pgbench •  Hinting for number of distinct values •  Machine readable EXPLAIN •  Checking password strength •  PL/pgSQL by default
PostgreSQL Community
Who makes database Software •  Oracle/SUN •  IBM •  Microsoft
PostgreSQL Community •  Community not company •  Growing •  International •  PostgreSQL Association •  Nothing happening with SUN/MySQL/Oracle is hurting the PostgreSQL community
www.postgresql.org •  downloads •  documentation •  bug reports •  security alerts •  wiki •  support companies
www.pgfoundry.org •  Modules •  Programs •  Resources
www.planetpostgresql.org •  Project News •  Community News •  Helpful Tips / Examples
archives.postgresql.org •  mailing list archives back to 1997 •  full text search via postgtresql 8.3 •  keyword search suggestions
#postgresql •  irc.freenode.net •  real time help •  rtfm_please - ??help
PostgreSQL Community Events •  PGCon •  PostgreSQL Conference 2009 Japan •  European PGDay •  PgWest •  PgEast •  PGCon Brazil •  Local PUGs •  http://wiki.postgresql.org/wiki/Events
Project Management •  Core team •  Committers •  -hackers •  Roadmap •  Web team
Threats to PostgreSQL •  Patent attacks •  Hiring of volunteers to work on unrelated projects
Evil
Data Types •  Just use text ▫  char/varchar/text the same under the hood ▫  avoid artificial limits •  Focus on functions ▫  Phone numbers often require string manipulation ▫  Unix timestamp vs. Date arithmetic •  Minimize typecasts
Take advantage of strong data typing •  CHECK limits input at column level •  ENUM limits specific values at type level ▫  Allows you to define a custom order, provides compact storage •  DOMAIN defines a data type within constraint boundaries •  Often outperforms JOIN on lookup tables •  Allows for simpler schema design
Normalization (3NF) •  All non-key columns must be directly dependent on PK ▫  Head is only dependent on the Name through the Workgroup column
Surrogate Keys •  Natural Key (NK) is a CK with a natural relationship to that row •  Surrogate Key (SK) is an artificially added unique identifier ▫  Since they are artificial they make queries harder to readand can lead to more joins •  Integers do not significantly improve JOIN performance or reduce file I/O for many data sets
Bareword ids •  Makes SQL less obvious to read ▫  SELECT id, id, id FROM foo, bar, baz WHERE … ▫  Makes ANSI JOIN syntax more cumbersome JOIN foo USING (bar_id) ▫  JOIN foo ON (foo.bar_id = bar.id) •  Often resort to alias columns to add clarity, scoping •  Some ORMs really like this (can be overridden •  Use verbose id names instead ▫  Create table actor (actor_id, full_name text);
Use standard definitions •  Often data has been designed in a standard way ▫  Country Code ▫  Email address ▫  Zip Code ▫  VIN ▫  SEX (ISO 5218) •  Helps eliminate short-sightedness •  Increases commonality across projects
Images in the database •  Replication •  Backups •  Access control •  Transactions •  OS Portability
Over indexing •  Indexes must be updated when data changes occur ▫  INSERT/UPDATE/DELETE all touch indexes ▫  Some like it HOT, pg_stat_all_tables •  BitMap vs. Multi-Column Indexes ▫  Combine index on (a) and (b) in memory ▫  Index on (x,y,z) implies index on (x) and (x,y) •  Make sure indexes are used ▫  pg_stat_all_indexes
Covering indexes •  Creating indexes to avoid accessing data in the table ▫  TOAST makes this less necessary ▫  Visibility information stored in the table
Full text indexing •  IT ROCKS! •  Add search engine style functionality to DBMS ▫  LIKE '%foo%' and LIKE '%foo' cannot use index ▫  Regex searching has similar issues ▫  Built-in tsearch functionality in 8.3+   GIN, expensive to update, very fast for searching   GIST, cheaper to update, not as fast for searching •  Database Specific Syntax
Profit
Lessons To Take Away •  PostgreSQL 9.0 is going to rock •  Never confuse a company with community •  Some mistakes are just to much fun to make only once •  Never ask for directions from a two-headed tourist! -Big Bird
Thanks to •  The PostgreSQL Community! •  Robert Treat •  Hubert Lubaczewski •  More info: ▫  http://www.depesz.com/ ▫  http://www.xzilla.net/
Questions •  Web: http://www.chasingnuts.com •  Email: aaron@chasingnuts.com •  IRC: AaronThul on irc.freenode.org •  Jabber: apthul@gmail.com •  Twitter: @AaronThul •  AIM: AaronThul

PostgreSQL 9.0 & The Future

  • 1.
    PostgreSQL and thefuture Aaron Thul
  • 2.
    Who am I? • Computer & Database Geek •  Formerly a Debian SysAdmin •  Presently Vice President of Technology at a EMOL •  PostgreSQL Evangelist •  Penguicon Organizer
  • 5.
    Ground rules •  Questionsare good •  Arguments are bad
  • 6.
    Agenda •  Big News • New Features •  State of PostgreSQL Community •  Evil •  Profit
  • 7.
    First Things First • There is some big news
  • 8.
    First Things First • There is some big news •  PostgreSQL 8.5 is…
  • 9.
    First Things First • There is some big news •  PostgreSQL 8.5 is… • PostgreSQL 9.0 !
  • 10.
    Why 8.5 became9.0 •  Built-in log-streaming replication •  Hot standby
  • 11.
  • 12.
    Streaming replication •  WarmStandby 8.x to Hot Standby •  New kinds of postmaster processes, walsenders and walreceiver •  Thanks to Streaming Replication, lag should be nearly zero
  • 13.
    DROP IF EXISTS • Columns •  Constraints
  • 14.
    Better messages forunique violation •  Improve unique-constraint-violation error messages to include the exact values being complained of.
  • 15.
    Deferrable Uniqueness •  Massiveunique-key reassignments
  • 16.
    MOVE FORWARD orBACKWARD •  MOVE {FORWARD,BACKWARD} X •  More easily move around in a curser
  • 17.
    ’samehost’ and ’samenet’ • Does anyone know what HBA stands for? •  To lazy to list all the hosts on your network •  CIDR-ADDRESS
  • 18.
    GRANT ALL •  Grantprivileges on all existing tables •  Automatically grant privileges on all tables that will be created in this database in the future
  • 19.
    TRIGGERS on columns • SQL-compliant triggers on columns, ie fire only if certain columns are named in the UPDATE's SET list •  Help resolve circular TRIGGER issues
  • 20.
    Conditional TRIGGERS CREATE TRIGGERtest_u AFTER UPDATE ON test FOR EACH ROW WHEN ( NEW.i <= 0 ) EXECUTE PROCEDURE test_u();
  • 21.
    VACUUM FULL •  VACUUMFULL will start to behave like CLUSTER •  Will not actually be reordering rows •  VACUUM FULL INPLACE
  • 22.
    Buffers info forexplain explain ( analyze on, buffers on ) select count(*) from pg_attribute ; QUERY PLAN ---------------------------------------------------- ---------------------- Aggregate (cost=64.70..64.71 rows=1 width=0) (actual time=0.466..0.466 rows=1 loops=1) Buffers: shared hit=18 read=21 -> Seq Scan on pg_attribute (cost=0.00..59.56 rows=2056 width=0) (actual time=0.002..0.301 rows=2002 loops=1) Buffers: shared hit=18 read=21 Total runtime: 0.492 ms
  • 23.
    The list goeson •  Multi-threaded pgbench •  Hinting for number of distinct values •  Machine readable EXPLAIN •  Checking password strength •  PL/pgSQL by default
  • 24.
  • 25.
    Who makes databaseSoftware •  Oracle/SUN •  IBM •  Microsoft
  • 26.
    PostgreSQL Community •  Communitynot company •  Growing •  International •  PostgreSQL Association •  Nothing happening with SUN/MySQL/Oracle is hurting the PostgreSQL community
  • 27.
    www.postgresql.org •  downloads •  documentation • bug reports •  security alerts •  wiki •  support companies
  • 28.
  • 29.
    www.planetpostgresql.org •  Project News • Community News •  Helpful Tips / Examples
  • 30.
    archives.postgresql.org •  mailing listarchives back to 1997 •  full text search via postgtresql 8.3 •  keyword search suggestions
  • 31.
    #postgresql •  irc.freenode.net •  realtime help •  rtfm_please - ??help
  • 32.
    PostgreSQL Community Events • PGCon •  PostgreSQL Conference 2009 Japan •  European PGDay •  PgWest •  PgEast •  PGCon Brazil •  Local PUGs •  http://wiki.postgresql.org/wiki/Events
  • 33.
    Project Management •  Coreteam •  Committers •  -hackers •  Roadmap •  Web team
  • 34.
    Threats to PostgreSQL • Patent attacks •  Hiring of volunteers to work on unrelated projects
  • 35.
  • 36.
    Data Types •  Justuse text ▫  char/varchar/text the same under the hood ▫  avoid artificial limits •  Focus on functions ▫  Phone numbers often require string manipulation ▫  Unix timestamp vs. Date arithmetic •  Minimize typecasts
  • 37.
    Take advantage ofstrong data typing •  CHECK limits input at column level •  ENUM limits specific values at type level ▫  Allows you to define a custom order, provides compact storage •  DOMAIN defines a data type within constraint boundaries •  Often outperforms JOIN on lookup tables •  Allows for simpler schema design
  • 38.
    Normalization (3NF) •  Allnon-key columns must be directly dependent on PK ▫  Head is only dependent on the Name through the Workgroup column
  • 39.
    Surrogate Keys •  NaturalKey (NK) is a CK with a natural relationship to that row •  Surrogate Key (SK) is an artificially added unique identifier ▫  Since they are artificial they make queries harder to readand can lead to more joins •  Integers do not significantly improve JOIN performance or reduce file I/O for many data sets
  • 40.
    Bareword ids •  MakesSQL less obvious to read ▫  SELECT id, id, id FROM foo, bar, baz WHERE … ▫  Makes ANSI JOIN syntax more cumbersome JOIN foo USING (bar_id) ▫  JOIN foo ON (foo.bar_id = bar.id) •  Often resort to alias columns to add clarity, scoping •  Some ORMs really like this (can be overridden •  Use verbose id names instead ▫  Create table actor (actor_id, full_name text);
  • 41.
    Use standard definitions • Often data has been designed in a standard way ▫  Country Code ▫  Email address ▫  Zip Code ▫  VIN ▫  SEX (ISO 5218) •  Helps eliminate short-sightedness •  Increases commonality across projects
  • 42.
    Images in thedatabase •  Replication •  Backups •  Access control •  Transactions •  OS Portability
  • 43.
    Over indexing •  Indexesmust be updated when data changes occur ▫  INSERT/UPDATE/DELETE all touch indexes ▫  Some like it HOT, pg_stat_all_tables •  BitMap vs. Multi-Column Indexes ▫  Combine index on (a) and (b) in memory ▫  Index on (x,y,z) implies index on (x) and (x,y) •  Make sure indexes are used ▫  pg_stat_all_indexes
  • 44.
    Covering indexes •  Creatingindexes to avoid accessing data in the table ▫  TOAST makes this less necessary ▫  Visibility information stored in the table
  • 45.
    Full text indexing • IT ROCKS! •  Add search engine style functionality to DBMS ▫  LIKE '%foo%' and LIKE '%foo' cannot use index ▫  Regex searching has similar issues ▫  Built-in tsearch functionality in 8.3+   GIN, expensive to update, very fast for searching   GIST, cheaper to update, not as fast for searching •  Database Specific Syntax
  • 46.
  • 47.
    Lessons To TakeAway •  PostgreSQL 9.0 is going to rock •  Never confuse a company with community •  Some mistakes are just to much fun to make only once •  Never ask for directions from a two-headed tourist! -Big Bird
  • 48.
    Thanks to •  ThePostgreSQL Community! •  Robert Treat •  Hubert Lubaczewski •  More info: ▫  http://www.depesz.com/ ▫  http://www.xzilla.net/
  • 49.
    Questions •  Web: http://www.chasingnuts.com • Email: aaron@chasingnuts.com •  IRC: AaronThul on irc.freenode.org •  Jabber: apthul@gmail.com •  Twitter: @AaronThul •  AIM: AaronThul