I have a 7 TB table in a PostgreSQL 12.3 database. I want to create a "RANGE type" partition on this column. The other thing to note is that this year column is a text field rather than an integer field. There was a special cases that facilitated this decision a decade ago. You will see by the end my confusion about what kind of partition does this end up being, and how can I finish the implementation?
The table is real_estate_sales. The year is the year the home was built. I propose to partition the data in a way that will provide nearly equal sizes of partitions. I realize there are no indexes here, I have left them out for the sake of brevity.
- Create the partitioned table. This is the structure of the large home_sales table.
CREATE TABLE home_sales_partitioned (
"id" int8 NOT NULL DEFAULT nextval('home_sales_id_seq'::regclass),
address1 varchar NOT NULL,
address2 varchar,
postal_code varchar NOT NULL,
vehicle_digest varchar NOT NULL,
year_built varchar NOT NULL
) PARTITION BY RANGE (year);
That is what I want to do. But I found out I couldn't once I tried this:
- Create the integer partitions
CREATE TABLE home_sales_1900_30s (CHECK (year ~ E'^\\d+$' AND year::integer >= 1900 AND year::integer < 1940)) INHERITS (home_sales_partitioned);
But I learned that if you are inheriting from a partitioned table, you have to make that table partitioned to. So I changed it to:
- Create the partitioned table
CREATE TABLE home_sales_partitioned (
"id" int8 NOT NULL DEFAULT nextval('home_sales_id_seq'::regclass),
address1 varchar NOT NULL,
address2 varchar,
postal_code varchar NOT NULL,
vehicle_digest varchar NOT NULL,
year varchar NOT NULL
);
And left off the PARTITION BY statement since our only goal is to create the partition framework first. Then I ran:
Then these statements worked:
* Create the integer partitions
CREATE TABLE home_sales_1900_30s (CHECK (year ~ E'^\\d+$' AND year::integer >= 1900 AND year::integer < 1940)) INHERITS (home_sales_partitioned);
CREATE TABLE home_sales_1940_60s (CHECK (year ~ E'^\\d+$' AND year::integer >= 1940 AND year::integer < 1970)) INHERITS (home_sales_partitioned);
CREATE TABLE home_sales_1970s (CHECK (year ~ E'^\\d+$' AND year::integer >= 1970 AND year::integer < 1980)) INHERITS (home_sales_partitioned);
CREATE TABLE home_sales_1980_84 (CHECK (year ~ E'^\\d+$' AND year::integer >= 1980 AND year::integer < 1985)) INHERITS (home_sales_partitioned);
CREATE TABLE home_sales_1985_89 (CHECK (year ~ E'^\\d+$' AND year::integer >= 1985 AND year::integer < 1990)) INHERITS (home_sales_partitioned);
CREATE TABLE home_sales_1990_94 (CHECK (year ~ E'^\\d+$' AND year::integer >= 1990 AND year::integer < 1995)) INHERITS (home_sales_partitioned);
CREATE TABLE home_sales_2000_04 (CHECK (year ~ E'^\\d+$' AND year::integer >= 2000 AND year::integer < 2005)) INHERITS (home_sales_partitioned);
CREATE TABLE home_sales_2005_xx (CHECK (year ~ E'^\\d+$' AND year::integer >= 2005)) INHERITS (home_sales_partitioned);
Those all created successfully. Then I created a function to redirect the data on insert:
* Create the partition function
CREATE OR REPLACE FUNCTION year_range_partition(year varchar)
RETURNS text AS $$
BEGIN
IF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 1900 AND year::integer < 1940 THEN
RETURN 'home_sales_1900_30s';
ELSIF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 1940 AND year::integer < 1960 THEN
RETURN 'home_sales_1940_60s';
ELSIF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 1960 AND year::integer < 1970 THEN
RETURN 'home_sales_1960s';
ELSIF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 1970 AND year::integer < 1980 THEN
RETURN 'home_sales_1970s';
ELSIF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 1980 AND year::integer < 1985 THEN
RETURN 'home_sales_1980_84';
ELSIF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 1985 AND year::integer < 1990 THEN
RETURN 'home_sales_1985_89';
ELSIF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 1990 AND year::integer < 1995 THEN
RETURN 'home_sales_1990_94';
ELSIF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 1995 AND year::integer < 2000 THEN
RETURN 'home_sales_2000_04';
ELSIF year ~ E'^\\d+$' AND length(year) = 4 AND year::integer >= 2001 AND year::integer < 2005 THEN
RETURN 'home_sales_2005_xx';
ELSE
RETURN 'home_sales_non_integer';
END IF;
END;
$$ LANGUAGE plpgsql;
And this was created successfully. Now I need to create a function and a trigger on the partition table to route the data successfully upon insert:
* Create the function
CREATE OR REPLACE FUNCTION insert_catalog_entry_fitments_partitioned()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO year_range_partition(NEW.year)
VALUES (NEW.*);
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER insert_home_sales_partitioned_trigger
BEFORE INSERT ON home_sales_partitioned
FOR EACH ROW
EXECUTE FUNCTION insert_home_sales_partitioned();
I am expecting this insert to work correctly. After the insert is complete, I can do:
drop table home_sales;
alter table home_sales_partitioned rename to home_sales.
But now, how to I tell the home_sales table that it is PARTITIONed? What is the command for that? And is this essentially a RANGE partition, or because it's a varchar field, is it some other type? I know that I must need to do this, since SELECTs won't be constructed properly if the table doesn't know it's own partitions.
WHEREconditions or so that you can discard old data withDROP TABLE. There is no way to partition a non-partitioned table except by copying most of the data around, which means a lengthy down time. There are other options, but your question contains too many things at once to be answered in depth.