I'm new to databases and have been reading the Postgres documentation. It mentions that data is stored on disk, which seems to imply that the data is stored on only one machine. Is that correct?
How else would you expect it to work? If you're thinking of "cloud-hosted databases", you need to remember that there is no such thing as "the cloud": it's all just someone else's computer. – Dai, Oct 2, 2020 at 5:35
As for distributed databases, that's something else entirely: have fun dealing with the CAP theorem. At the local level, distributed databases still work on the same basis as Postgres; the difference is that each individual machine has either a replicated copy of the entire database or a subset of it. – Dai, Oct 2, 2020 at 5:37
Define "machine". All databases store data on disk. That "disk", though, may be a highly redundant RAID array or a SAN (storage area network). Except for development machines, the data is never stored on just a single disk, and for high-performance servers the "disks" are on their own machine. – Panagiotis Kanavos, Oct 2, 2020 at 5:51
Then there's replication and clustering: the data from one server is replicated across multiple servers in a cluster, so if one goes down, another can start serving requests for that database. – Panagiotis Kanavos, Oct 2, 2020 at 5:56
1 Answer
Yes, your understanding is correct.
PostgreSQL does not offer a distributed solution (e.g. a shared-nothing architecture). There are forks (Greenplum, Postgres-XL) and an extension (Citus) that can distribute storage across multiple servers, but this is not available natively in the "vanilla" PostgreSQL version.
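To make "distributing storage across multiple servers" concrete, here is a toy Python model contrasting a shared-nothing layout (each node holds a subset of the rows) with plain replication (each node holds a full copy). The function names `shard_for`, `distribute`, and `replicate` are made up for illustration, and the hash scheme is an assumption; this is not how Citus, Greenplum, or Postgres-XL are actually implemented.

```python
import hashlib


def shard_for(key: str, n_nodes: int) -> int:
    """Map a row key to a node index using a stable hash (toy scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_nodes


def distribute(rows: dict, n_nodes: int) -> list:
    """Shared-nothing layout: each node stores only its own subset of rows."""
    nodes = [dict() for _ in range(n_nodes)]
    for key, value in rows.items():
        nodes[shard_for(key, n_nodes)][key] = value
    return nodes


def replicate(rows: dict, n_nodes: int) -> list:
    """Replicated layout: every node stores a full copy of the data."""
    return [dict(rows) for _ in range(n_nodes)]


# Hypothetical sample data: ten rows keyed by a user id.
rows = {f"user-{i}": {"id": i} for i in range(10)}
sharded = distribute(rows, 3)
replicated = replicate(rows, 3)

for i, node in enumerate(sharded):
    print(f"sharded node {i}: {len(node)} rows")
for i, node in enumerate(replicated):
    print(f"replica node {i}: {len(node)} rows")
```

In the sharded layout, a query for one key only needs to reach the node that `shard_for` points at, while in the replicated layout any node can answer it.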
You can read and write data on different Postgres servers through a foreign data wrapper, but that's not the same as a proper distributed solution (e.g. foreign tables don't participate correctly in transactions).
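As a rough sketch of what the foreign data wrapper setup involves, the Python helper below builds the SQL statements for `postgres_fdw`, which you would then run on the local server with whatever client you use. The helper names (`sql_literal`, `fdw_setup_statements`) and all connection details (`remote_pg`, `db2.example.com`, `appdb`, `app`, `orders`) are placeholders invented for this example, and it assumes the remote table lives in the `public` schema.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def fdw_setup_statements(server, host, port, dbname, user, password, table):
    """Build the statements that expose one remote table through postgres_fdw.

    `server` and `table` are assumed to be plain, trusted identifiers;
    they are inserted as-is and not quoted.
    """
    return [
        # Install the wrapper in the local database.
        "CREATE EXTENSION IF NOT EXISTS postgres_fdw;",
        # Describe how to reach the remote server.
        f"CREATE SERVER {server} FOREIGN DATA WRAPPER postgres_fdw "
        f"OPTIONS (host {sql_literal(host)}, port {sql_literal(str(port))}, "
        f"dbname {sql_literal(dbname)});",
        # Map the current local role to a remote login.
        f"CREATE USER MAPPING FOR CURRENT_USER SERVER {server} "
        f"OPTIONS (user {sql_literal(user)}, password {sql_literal(password)});",
        # Create a local foreign table that mirrors the remote table definition.
        f"IMPORT FOREIGN SCHEMA public LIMIT TO ({table}) "
        f"FROM SERVER {server} INTO public;",
    ]


statements = fdw_setup_statements(
    "remote_pg", "db2.example.com", 5432, "appdb", "app", "s3'cret", "orders"
)
for statement in statements:
    print(statement)
```

After these statements run, `orders` can be queried locally like an ordinary table, but the rows still live on the remote server.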
2 Comments
Panagiotis Kanavos
I don't think the OP is asking about distributed databases. A db cluster would store the data on multiple machines even though only one server would serve data for a single database at a time, and that's just the out-of-the-box replication functionality.