PostgreSQL: Partitioned Table vs Non Partitioned Table (Part 3)

This article compares the speed and performance of queries between partitioned and non partitioned PostgreSQL tables. However, it is important to remember that the PostgreSQL tables partitioning has also another benefits, than the better performance on queries. More information about other benefits from the first part ‘Howto create PostgreSQL table partitioning (Part 1)‘.

This is comparision between partitioned and non partitioned PostgreSQL tables. The same tests were carried out with and without indices, because using the indices, it is no longer very meaningful example on table, which has one billion rows of data (if the table is not partitioned). This comparison is used an example of an existing table, which looks like this:

               TABLE "public.impressions_by_day_partitioned"
    COLUMN     |  TYPE   |              Modifiers               
---------------+---------+--------------------------------------
 advertiser_id | INTEGER | NOT NULL
 DAY           | DATE    | NOT NULL DEFAULT ('now'::text)::DATE
 impressions   | INTEGER | NOT NULL

Both partitioned and non partitioned tables are identical and tables contains one hundred million rows test data:

SELECT COUNT(*) FROM impressions_by_day_partitioned;
  COUNT   
----------
 100000000
(1 ROW)

6 Test cases

  • 1a. Select total sum of impressions on January 2009. (Day field indexed)
  • 1b. Select total sum of impressions on January 2009. (Day field not indexed)
  • 2a. Select the number of all advertisers in 2011. (Day field indexed)
  • 2b. Select the number of all advertisers in 2011. (Day field not indexed)
  • 3a. Select MAX impressions on 25 November 2010. (Day field indexed)
  • 3b. Select MAX impressions on 25 November 2010. (Day field not indexed)
  • 4a. Select Top ten advertisers on September 2009. (Day field indexed)
  • 4b. Select Top ten advertisers on September 2009. (Day field not indexed)
  • 5a. Update all impressions to 50 000 on 16 July 2010. (Day field indexed)
  • 5b. Update all impressions to 50 000 on 16 July 2010. (Day field not indexed)
  • 6a. Insert ten rows to table. (Day field indexed)
  • 6b. Insert ten rows to table. (Day field not indexed)

1a. Select total sum of impressions on January 2009. (Day field indexed)

[test]> SELECT SUM(impressions) 
FROM impressions_by_day_partitioned 
WHERE DAY >= '2009-01-01' AND DAY < '2009-02-01';
     SUM      
--------------
 151715749479
(1 ROW)
 
TIME: 4441.879 ms
[test]> SELECT SUM(impressions) 
FROM impressions_by_day_non_partitioned 
WHERE DAY >= '2009-01-01' AND DAY < '2009-02-01';
     SUM      
--------------
 151715749479
(1 ROW)
 
TIME: 12335.544 ms

1b. Select total sum of impressions on January 2009. (Day field not indexed)

[test]> SELECT SUM(impressions) 
FROM impressions_by_day_partioned 
WHERE DAY >= '2009-01-01' AND DAY < '2009-02-01';
     SUM      
--------------
 151715749479
(1 ROW)
 
TIME: 4760.410 ms
[test]> SELECT SUM(impressions) 
FROM impressions_by_day_non_partioned 
WHERE DAY >= '2009-01-01' AND DAY < '2009-02-01';
     SUM      
--------------
 151715749479
(1 ROW)
 
TIME: 30011.654 ms

2a. Select the number of all advertisers in 2011. (Day field indexed)

[test]> SELECT COUNT(DISTINCT(advertiser_id)) 
FROM impressions_by_day_partitioned 
WHERE DAY >= '2011-01-01' AND DAY < '2012-01-01';
 COUNT  
--------
 998952
(1 ROW)
 
TIME: 35965.297 ms
[test]> SELECT COUNT(DISTINCT(advertiser_id)) 
FROM impressions_by_day_non_partitioned 
WHERE DAY >= '2011-01-01' AND DAY < '2012-01-01';
 COUNT  
--------
 998952
(1 ROW)
 
TIME: 38207.764 ms

2b. Select the number of all advertisers in 2011. (Day field not indexed)

[test]> SELECT COUNT(DISTINCT(advertiser_id)) 
FROM impressions_by_day_partitioned 
WHERE DAY >= '2011-01-01' AND DAY < '2012-01-01';
 COUNT  
--------
 998952
(1 ROW)
 
TIME: 35994.232 ms
[test]> SELECT COUNT(DISTINCT(advertiser_id)) 
FROM impressions_by_day_non_partitioned 
WHERE DAY >= '2011-01-01' AND DAY < '2012-01-01';
 COUNT  
--------
 998952
(1 ROW)
 
TIME: 54922.130 ms

3a. Select MAX impressions on 25 November 2010. (Day field indexed)

[test]> SELECT MAX(impressions) FROM impressions_by_day_partitioned 
WHERE DAY = '2010-11-25';
  MAX   
--------
 100000
(1 ROW)
 
TIME: 223.741 ms
[test]> SELECT MAX(impressions) FROM impressions_by_day_non_partioned 
WHERE DAY = '2010-11-25';
  MAX   
--------
 100000
(1 ROW)
 
TIME: 795.810 ms

3b. Select MAX impressions on 25 November 2010. (Day field not indexed)

[test]> SELECT MAX(impressions) FROM impressions_by_day 
WHERE DAY = '2010-11-25';
  MAX   
--------
 100000
(1 ROW)
 
TIME: 522.345 ms
[test]> SELECT MAX(impressions) FROM impressions_by_day_test 
WHERE DAY = '2010-11-25';
  MAX   
--------
 100000
(1 ROW)
 
TIME: 27916.619 ms

4a. Select Top ten advertisers on September 2009. (Day field indexed)

[test]> SELECT advertiser_id, SUM(impressions) AS impressions 
FROM impressions_by_day_partitioned 
WHERE DAY >= '2009-09-01' AND DAY < '2009-10-01' 
GROUP BY advertiser_id ORDER BY impressions DESC LIMIT 10;
 advertiser_id | impressions 
---------------+-------------
        155772 |      424030
        121456 |      409425
         99802 |      402586
        722802 |      391690
        609813 |      387730
        812803 |      387310
        314459 |      382856
        701436 |      380741
         31012 |      378990
        712014 |      378537
(10 ROWS)
 
TIME: 1563.499 ms
[test]> SELECT advertiser_id, SUM(impressions) AS impressions 
FROM impressions_by_day_non_partitioned 
WHERE DAY >= '2009-09-01' AND DAY < '2009-10-01' 
GROUP BY advertiser_id ORDER BY impressions DESC LIMIT 10;
 advertiser_id | impressions 
---------------+-------------
        155772 |      424030
        121456 |      409425
         99802 |      402586
        722802 |      391690
        609813 |      387730
        812803 |      387310
        314459 |      382856
        701436 |      380741
         31012 |      378990
        712014 |      378537
(10 ROWS)
 
TIME: 3824.523 ms

4b. Select Top ten advertisers on September 2009. (Day field not indexed)

[test]> SELECT advertiser_id, SUM(impressions) AS impressions 
FROM impressions_by_day_partitioned 
WHERE DAY >= '2009-09-01' AND DAY < '2009-10-01' 
GROUP BY advertiser_id ORDER BY impressions DESC LIMIT 10;
 advertiser_id | impressions 
---------------+-------------
        155772 |      424030
        121456 |      409425
         99802 |      402586
        722802 |      391690
        609813 |      387730
        812803 |      387310
        314459 |      382856
        701436 |      380741
         31012 |      378990
        712014 |      378537
(10 ROWS)
 
TIME: 1592.145 ms
[test]> SELECT advertiser_id, SUM(impressions) AS impressions 
FROM impressions_by_day_non_partitioned 
WHERE DAY >= '2009-09-01' AND DAY < '2009-10-01' 
GROUP BY advertiser_id ORDER BY impressions DESC LIMIT 10;
 advertiser_id | impressions 
---------------+-------------
        155772 |      424030
        121456 |      409425
         99802 |      402586
        722802 |      391690
        609813 |      387730
        812803 |      387310
        314459 |      382856
        701436 |      380741
         31012 |      378990
        712014 |      378537
(10 ROWS)
 
TIME: 17168.329 ms

5a. Update all impressions to 50 000 on 16 July 2010. (Day field indexed)

[test]> UPDATE impressions_by_day_partitioned SET impressions = 50000 
WHERE DAY = '2010-07-16';
UPDATE 27004
TIME: 86416.192 ms
[test]> UPDATE impressions_by_day_non_partitioned SET impressions = 50000 
WHERE DAY = '2010-07-16';
UPDATE 27004
TIME: 68784.290 ms

5b. Update all impressions to 50 000 on 16 July 2010. (Day field not indexed)

UPDATE impressions_by_day_partitioned SET impressions = 50000 
WHERE DAY = '2010-07-16';
UPDATE 27004
TIME: 93843.251 ms
[test]> UPDATE impressions_by_day_non_partitioned SET impressions = 50000 
WHERE DAY = '2010-07-16';
UPDATE 27004
TIME: 317275.538 ms

6a. Insert ten rows to table. (Day field indexed)

[test]> INSERT INTO impressions_by_day_partitioned VALUES 
(1001, '2009-01-08', 75455), 
(10431, '2010-11-04', 47457),
(1241, '2011-07-04', 3457),
(5353, '2009-06-01', 43357),
(3636, '2012-09-16', 8755),
(3939, '2009-11-19', 5454),
(7766, '2010-02-21', 8944),
(44676, '2012-04-26', 24110),
(5467, '2011-08-24', 1477),
(3243, '2009-11-20', 65478);
INSERT 0 0
TIME: 892.246 ms
[test]> INSERT INTO impressions_by_day_non_partitioned VALUES 
(1001, '2009-01-08', 75455), 
(10431, '2010-11-04', 47457),
(1241, '2011-07-04', 3457),
(5353, '2009-06-01', 43357),
(3636, '2012-09-16', 8755),
(3939, '2009-11-19', 5454),
(7766, '2010-02-21', 8944),
(44676, '2012-04-26', 24110),
(5467, '2011-08-24', 1477),
(3243, '2009-11-20', 65478);
INSERT 0 10
TIME: 205.299 ms

6b. Insert ten rows to table. (Day field not indexed)

[test]> INSERT INTO impressions_by_day_partitioned VALUES 
(1001, '2009-01-08', 75455), 
(10431, '2010-11-04', 47457),
(1241, '2011-07-04', 3457),
(5353, '2009-06-01', 43357),
(3636, '2012-09-16', 8755),
(3939, '2009-11-19', 5454),
(7766, '2010-02-21', 8944),
(44676, '2012-04-26', 24110),
(5467, '2011-08-24', 1477),
(3243, '2009-11-20', 65478);
INSERT 0 0
TIME: 886.146 ms
[test]> INSERT INTO impressions_by_day_non_partitioned VALUES 
(1001, '2009-01-08', 75455), 
(10431, '2010-11-04', 47457),
(1241, '2011-07-04', 3457),
(5353, '2009-06-01', 43357),
(3636, '2012-09-16', 8755),
(3939, '2009-11-19', 5454),
(7766, '2010-02-21', 8944),
(44676, '2012-04-26', 24110),
(5467, '2011-08-24', 1477),
(3243, '2009-11-20', 65478);
INSERT 0 10
TIME: 204.894 ms

Summary

A few test queries simply shows the fact that a Select queries is faster to partitioned table than the non-partitioned table. Is easily seen by the fact that Updates and Inserts slowed little bit on partitioned table.

These differences are much clearer on Select queries, if the amount of data is much higher. At a certain point indexing is not working anymore, because index size is too big. Then the only sensible option is partitioning.

Follow If Not True Then False Updates!

9 Comments

  1. Hello,

    Thanks for writing such a great tutorial on PostgreSQL Table Partition. I enjoy reading it very much.

    I have some question in your tutorial.

    Based on http://www.if-not-true-then-false.com/2009/12/performance-testing-between-partitioned-and-non-partitioned-postgresql-tables-part-3/

    Example 3a. Select MAX impressions on 25 November 2010. (Day field indexed)

    By using

    SELECT MAX(impressions) FROM impressions_by_day_non_partioned 
    WHERE DAY = '2010-11-25';

    How does PostgreSQL smart enough to know he need to look for table

    CREATE TABLE impressions_by_day_y2010m11ms2 (
    	PRIMARY KEY (advertiser_id, DAY), 
    	CHECK ( DAY &gt;= DATE '2010-11-01' AND DAY = DATE '2010-11-01' AND DAY &lt; DATE &#039;2011-01-01&#039;) ?

    Thanks!

    • Hi,

      Thank you for your comment. Glad to hear that my writings are enjoyable to read.

      And in response to your question:
      Apparently, you mean the 3a query to table “impressions_by_day_partitioned”, not to table “impressions_by_day_non_partioned”?

      The first part (Howto create PostgreSQL table partitioning (Part 1)) in section five, I tell about constraint_exclusion. Constraint_exclusion is a query optimization technique that improves performance for partitioned tables. I can show you a example:

      SET constraint_exclusion = off;
       
      EXPLAIN ANALYZE SELECT MAX(impressions) FROM impressions_by_day_partitioned WHERE DAY = '2010-11-25';
       
                                                                                                  QUERY PLAN                                                                                            
      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       Aggregate  (cost=372389.40..372389.41 ROWS=1 width=4) (actual TIME=10609.647..10609.649 ROWS=1 loops=1)
         ->  Append  (cost=18.80..372321.82 ROWS=27028 width=4) (actual TIME=7352.868..10563.892 ROWS=27143 loops=1)
               ->  Bitmap Heap Scan ON impressions_by_day_partitioned  (cost=18.80..29.36 ROWS=10 width=4) (actual TIME=0.029..0.029 ROWS=0 loops=1)
                     Recheck Cond: (DAY = '2010-11-25'::DATE)
                     ->  Bitmap INDEX Scan ON impressions_by_day_partitioned_pkey  (cost=0.00..18.80 ROWS=10 width=0) (actual TIME=0.019..0.019 ROWS=0 loops=1)
                           INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2009m1ms2_day ON impressions_by_day_partitioned_y2009m3ms2 impressions_by_day_partitioned  (cost=0.00..8.75 ROWS=1 width=4) (actual TIME=0.034..0.034 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2009m3ms2_day ON impressions_by_day_partitioned_y2009m3ms2 impressions_by_day_partitioned  (cost=0.00..8.75 ROWS=1 width=4) (actual TIME=0.034..0.034 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2009m5ms2_day ON impressions_by_day_partitioned_y2009m5ms2 impressions_by_day_partitioned  (cost=0.00..8.75 ROWS=1 width=4) (actual TIME=0.016..0.016 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2009m7ms2_day ON impressions_by_day_partitioned_y2009m7ms2 impressions_by_day_partitioned  (cost=0.00..8.75 ROWS=1 width=4) (actual TIME=0.015..0.015 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
      [...]
               ->  INDEX Scan USING impressions_by_day_partitioned_y2011m1ms2_day ON impressions_by_day_partitioned_y2011m1ms2 impressions_by_day_partitioned  (cost=0.00..8.46 ROWS=1 width=4) (actual TIME=0.017..0.017 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2011m3ms2_day ON impressions_by_day_partitioned_y2011m3ms2 impressions_by_day_partitioned  (cost=0.00..8.47 ROWS=1 width=4) (actual TIME=0.008..0.008 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2011m5ms2_day ON impressions_by_day_partitioned_y2011m5ms2 impressions_by_day_partitioned  (cost=0.00..8.46 ROWS=1 width=4) (actual TIME=0.007..0.007 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2011m7ms2_day ON impressions_by_day_partitioned_y2011m7ms2 impressions_by_day_partitioned  (cost=0.00..8.46 ROWS=1 width=4) (actual TIME=0.007..0.007 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2011m9ms2_day ON impressions_by_day_partitioned_y2011m9ms2 impressions_by_day_partitioned  (cost=0.00..8.45 ROWS=1 width=4) (actual TIME=0.007..0.007 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2011m11ms2_day ON impressions_by_day_partitioned_y2011m11ms2 impressions_by_day_partitioned  (cost=0.00..8.46 ROWS=1 width=4) (actual TIME=0.008..0.008 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)
      [...]

      So as you can see it loops thru all tables if constraint exclusion is off, and then query is much slower than query to single table. With constraint exclusion enabled, the PostgreSQL query planner will examine the constraints of each partition and try to prove that the partition need not be scanned because it could not contain any rows meeting the query’s WHERE clause. When the planner can prove this, it excludes the partition from the query plan, as following:

      SET constraint_exclusion = ON;
       
      EXPLAIN ANALYZE SELECT MAX(impressions) FROM impressions_by_day_partitioned WHERE DAY = '2010-11-25';
                                                                                   QUERY PLAN                                                                              
      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
       Aggregate  (cost=29980.57..29980.58 ROWS=1 width=4) (actual TIME=699.152..699.154 ROWS=1 loops=1)
         ->  Append  (cost=18.80..29913.06 ROWS=27005 width=4) (actual TIME=0.173..653.400 ROWS=27143 loops=1)
               ->  Bitmap Heap Scan ON impressions_by_day_partitioned  (cost=18.80..29.36 ROWS=10 width=4) (actual TIME=0.013..0.013 ROWS=0 loops=1)
                     Recheck Cond: (DAY = '2010-11-25'::DATE)
                     ->  Bitmap INDEX Scan ON impressions_by_day_partitioned_pkey  (cost=0.00..18.80 ROWS=10 width=0) (actual TIME=0.008..0.008 ROWS=0 loops=1)
                           INDEX Cond: (DAY = '2010-11-25'::DATE)
               ->  INDEX Scan USING impressions_by_day_partitioned_y2010m11ms2 ON impressions_by_day_partitioned_y2010m11ms2 impressions_by_day_partitioned  (cost=0.00..8.75 ROWS=1 width=4) (actual TIME=0.015..0.015 ROWS=0 loops=1)
                     INDEX Cond: (DAY = '2010-11-25'::DATE)

      And now same query to partitioned table works lightning fast compared to same query to single table.

      And after this long example, the simple answer to your question is that: this logic (exclude not needed tables) is built into PostgreSQL Query Planner.

  2. Thank you very much for your explanation.

    I am now following your tutorial, to implement table partition in one of my application.

    As my table rows grow into billions (very soon), I can feel that the query performance will some how drop.

    Table partition seems to be the answer of my problem.

    However, during implementation, I face some problem on it which I had mentioned in PostgreSQL general mailing list. I was wondering whether you are free enough to take a look, and input me some suggestion?

    http://archives.postgresql.org/pgsql-general/2010-01/msg01184.php

    Sorry if this is not the place to ask for your advice. But I cannot locate any of your contact.

    Thank you very much.

    • Sounds like so that the partitioning is really a good solution in your situation, when the rows are growing and growing.

      This is good place to ask advice for this problem, if I understand your problem rightly then my answer is very simple.

      You have following in your trigger:

      [...]
          -- NOTICE:  IN unit_insert_trigger, table is unit_0
          -- NOTICE:  IN unit_insert_trigger, NEW.unit_id is 28
          raise notice 'IN unit_insert_trigger, table is %', unit_table_name;
          raise notice 'IN unit_insert_trigger, NEW.unit_id is %', NEW.unit_id;
       
          EXECUTE 'INSERT INTO ' || quote_ident(unit_table_name) || 
          '(unit_id, fk_lot_id, cycle) VALUES (' || 
          NEW.unit_id || ',' || NEW.fk_lot_id || ',' || NEW.cycle || ')';
       
          RETURN NULL;
      END;

      Change that RETURN NULL; to RETURN NEW; and then RETURNING * should work as expected.

      So fixed version should look like that:

      [...]
          -- NOTICE:  IN unit_insert_trigger, table is unit_0
          -- NOTICE:  IN unit_insert_trigger, NEW.unit_id is 28
          raise notice 'IN unit_insert_trigger, table is %', unit_table_name;
          raise notice 'IN unit_insert_trigger, NEW.unit_id is %', NEW.unit_id;
       
          EXECUTE 'INSERT INTO ' || quote_ident(unit_table_name) || 
          '(unit_id, fk_lot_id, cycle) VALUES (' || 
          NEW.unit_id || ',' || NEW.fk_lot_id || ',' || NEW.cycle || ')';
       
          RETURN NEW;
      END;

      Did this solve your problem?

      And true, here is not my contact details, but I would shortly add a contact form :)

  3. But that will create additional problem. A duplicated entry will be created at parent table. To demo the problem :

    (1) create database named “tutorial_partition”

    (2) perform the following SQL query :

    CREATE TABLE impressions_by_day (
    advertiser_id SERIAL NOT NULL,
    day DATE NOT NULL DEFAULT CURRENT_DATE,
    impressions INTEGER NOT NULL,
    PRIMARY KEY (advertiser_id, day)
    );

    CREATE OR REPLACE FUNCTION insert_table()
    RETURNS void AS
    $BODY$DECLARE
    _impressions_by_day impressions_by_day;
    BEGIN
    INSERT INTO impressions_by_day(impressions ) VALUES(888) RETURNING * INTO _impressions_by_day;

    RAISE NOTICE ‘After insert, the returned advertiser_id is %’, _impressions_by_day.advertiser_id;
    END;$BODY$
    LANGUAGE ‘plpgsql’ VOLATILE;
    ALTER FUNCTION insert_table() OWNER TO postgres;

    CREATE TABLE impressions_by_day_y2010m1ms2 (
    PRIMARY KEY (advertiser_id, day),
    CHECK ( day >= DATE ’2010-01-01′ AND day = DATE ’2010-01-01′ AND NEW.day < DATE '2010-03-01' ) THEN
    INSERT INTO impressions_by_day_y2010m1ms2 VALUES (NEW.*);
    ELSE
    RAISE EXCEPTION 'Date out of range. Something wrong with the impressions_by_day_insert_trigger() function!';
    END IF;
    RETURN NEW;
    END;
    $$
    LANGUAGE plpgsql;

    CREATE TRIGGER insert_impressions_by_day_trigger
    BEFORE INSERT ON impressions_by_day
    FOR EACH ROW EXECUTE PROCEDURE impressions_by_day_insert_trigger();

    (3) execute

    SELECT * FROM insert_table()

    If you impressions_by_day_y2010m1ms2, everything looks fine.

    However, if you exam impressions_by_day, it will contains duplicated item.

    http://sites.google.com/site/yanchengcheok/Home/returningnew.PNG?attredirects=0

    Any idea?

    Thanks!

  4. By the way, do you think it is possible to solve by

    CREATE RULE

    ?

    Not sure, cause I am not sure use CREATE RULE instead of TRIGGER, is the way to implement table partition.

    • Actually yes, that RETURN NEW cause new problems, because if the function returns NEW and not NULL value, then insert to table is run normally.

      I also tried that CREATE RULE method, but it’s not working too, because it’s not possible write following RULES:

      CREATE RULE impressions_by_day_insert_y2010m1ms2 AS
      ON INSERT TO impressions_by_day WHERE
          ( DAY >= DATE '2010-01-01' AND DAY < DATE '2010-03-01' )
      DO INSTEAD
          INSERT INTO impressions_by_day__y2010m1ms2 VALUES (NEW.advertiser_id,NEW.DAY,NEW.impressions) RETURNING *;

      Because RETURNING on rule is not allowed, if there is some WHERE condition.

      Error message is following:

      [test]> INSERT INTO impressions_by_day VALUES (10,CURRENT_DATE,111) returning *;
      ERROR:  cannot perform INSERT RETURNING ON relation "impressions_by_day"
      HINT:  You need an unconditional ON INSERT DO INSTEAD rule WITH a RETURNING clause.

      Therefore it seems that PostgreSQL does not support this feature at all, but I wrote you a little (dirty) hack, which could get this to work. ;)

      So I think following should work:

      CREATE TABLE impressions_by_day (
      	advertiser_id SERIAL NOT NULL,
      	DAY DATE NOT NULL DEFAULT CURRENT_DATE,
      	impressions INTEGER NOT NULL,
              PRIMARY KEY (advertiser_id, DAY)
      );
       
      CREATE TABLE impressions_by_day_temp (
      	advertiser_id SERIAL NOT NULL,
      	DAY DATE NOT NULL DEFAULT CURRENT_DATE,
      	impressions INTEGER NOT NULL,
      	PRIMARY KEY (advertiser_id, DAY)
      );
       
      CREATE OR REPLACE FUNCTION insert_table()
        RETURNS void AS
      $BODY$DECLARE
          _impressions_by_day impressions_by_day;
      BEGIN
          INSERT INTO impressions_by_day(impressions ) VALUES(888) RETURNING  * INTO _impressions_by_day;
       
          RAISE NOTICE 'After insert, the returned advertiser_id is %', _impressions_by_day.advertiser_id;
      END;$BODY$
        LANGUAGE 'plpgsql' VOLATILE;
      ALTER FUNCTION insert_table() OWNER TO postgres;
       
      CREATE TABLE impressions_by_day_y2010m1ms2 (
      	PRIMARY KEY (advertiser_id, DAY), 
      	CHECK ( DAY >= DATE '2010-01-01' AND DAY < DATE '2010-03-01' )
      ) INHERITS (impressions_by_day);
       
      CREATE INDEX impressions_by_day_y2010m1ms2_index ON impressions_by_day_y2010m1ms2 (DAY);
       
      CREATE OR REPLACE RULE impressions_by_day_insert AS
      ON INSERT TO impressions_by_day 
      DO INSTEAD INSERT INTO impressions_by_day_temp VALUES (NEW.*) RETURNING *;
       
      CREATE OR REPLACE FUNCTION impressions_by_day_insert_trigger()
      RETURNS TRIGGER AS $$
      BEGIN
      	IF ( NEW.DAY >= DATE '2010-01-01' AND NEW.DAY < DATE '2010-03-01' ) THEN 
      		INSERT INTO impressions_by_day_y2010m1ms2 VALUES (NEW.*); 
      	ELSE
      		RAISE EXCEPTION 'Date out of range.  Something wrong with the impressions_by_day_insert_trigger() function!';
      	END IF;
      	DELETE FROM impressions_by_day_temp;
      	RETURN NEW;
      END;
      $$
      LANGUAGE plpgsql;
       
      CREATE TRIGGER insert_impressions_by_day_trigger 
      	BEFORE INSERT ON impressions_by_day_temp  
      	FOR EACH ROW EXECUTE PROCEDURE impressions_by_day_insert_trigger();

      So this work little differently than normal PostgreSQL partitioning: main table has rule, which inserts to temp table. Then trigger function inserts that to right partition.

      Maybe this works for you?

  5. Thanks.

    Instead applying your suggestion, I use a more straight forward approach.

    Instead of calling INSERT… into unit, and hoping trigger function will handle the right job.

    I will call insert directly on the child table.

    — There is reason behind why we do not want to use trigger technique for table unit.
    — Please refer to : http://archives.postgresql.org/pgsql-general/2010-01/msg01184.php
    — INSERT INTO unit(fk_lot_id, cycle)
    — VALUES(_lotID, _cycle) RETURNING unit_id INTO _unit_id;
    unit_table_index = _lotID;
    unit_table_name = ‘unit_’ || _lotID;

    IF NOT EXISTS(SELECT * FROM information_schema.tables WHERE table_name = unit_table_name) THEN
    EXECUTE ‘CREATE TABLE ‘ || quote_ident(unit_table_name) || ‘
    (
    unit_id bigserial NOT NULL,
    fk_lot_id bigint NOT NULL,
    CHECK (fk_lot_id = ‘ || (unit_table_index) || ‘),
    CONSTRAINT pk_unit_’ || unit_table_index || ‘_id PRIMARY KEY (unit_id),
    CONSTRAINT fk_lot_’ || unit_table_index || ‘_id FOREIGN KEY (fk_lot_id) REFERENCES lot (lot_id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE CASCADE
    ) INHERITS (unit);’;

    EXECUTE ‘CREATE INDEX fk_lot_’ || unit_table_index || ‘_id_idx ON ‘ || quote_ident(unit_table_name) || ‘(fk_lot_id);’;
    END IF;

    EXECUTE ‘INSERT INTO ‘ || quote_ident(unit_table_name) || ‘(fk_lot_id, cycle) VALUES (‘ || _lotID || ‘,’ || _cycle || ‘) RETURNING unit_id’
    INTO _unit_id;

    _unit.unit_id = _unit_id;
    _unit.fk_lot_id = _lotID;
    _unit.cycle = _cycle;

    • Really nice to hear that you got the problem resolved and got it to work, just as you wanted. :)

      Inserting directly on the child table is very good and robust solution in this case.

Leave a Comment

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>