traditional RDBMS (relational database management system). Mainly used for relational data, it is object-oriented in nature.
useful operation like in dataframe
- Specify your own custom functions
- Use recursion to create complex looping and data generation with the WITH keyword.
- Sort string columns by a substring.
- Perform complex joining between multiple tables.
- Use SQL to generate SQL (automating tasks).
- Generate forecasts using statistical models.
- Create histograms.
- Build tree structures (with leaf, branch, root nodes)
select
SELECT CustomerName || ' LIVES IN ' || Address || ', ' || City AS location
FROM Customers
WHERE Country='Mexico'
if/else
SELECT Price,
CASE WHEN Price < 12 THEN 'Cheap'
WHEN Price < 21 THEN 'Regular'
ELSE 'Expensive'
END AS Bucket
FROM Products
random select
SELECT * FROM Products ORDER BY random() LIMIT 5
index
hash index
Starting at PostgreSQL 10 these limitations were resolved, and Hash indexes are no longer discouraged. Hash index outperforms the B-Tree index with a very slight difference, but it has a much smaller index size. limitation:
- Hash index cannot be used to enforce a unique constraint
- Hash index cannot be used to create indexes on multiple columns
- Hash index cannot be used to create sorted indexes
- can’t use a Hash index to cluster a table
- Hash index cannot be used for range lookups
- A Hash index cannot be used to satisfy ORDER BY queries
Creating Hash Indexes
#an example of URL shortener service(provides a short random URL that points to a longer URL)
#table between key and full url
CREATE TABLE shorturl (
id serial primary key,
key text not null,
url text not null
);
#create a B-Tree and a Hash index on both fields (key and full url)
CREATE INDEX shorturl_key_hash_index ON shorturl USING hash(key);
CREATE UNIQUE INDEX shorturl_key_btree_index ON shorturl USING btree(key);
CREATE INDEX shorturl_url_hash_index ON shorturl USING hash(url);
CREATE INDEX shorturl_url_btree_index ON shorturl USING btree(url);
Hash Index Size
The Hash index is smaller than the B-Tree index.
CREATE EXTENSION "uuid-ossp";
DO $$
BEGIN
FOR i IN 0..1000000 loop
INSERT INTO shorturl (key, url) VALUES (
uuid_generate_v4(),
'https://www.supercool-url.com/' || round(random() * 10 ^ 6)::text
);
if mod(i, 10000) = 0 THEN
RAISE NOTICE 'rows:% Hash key% B-Tree key:% Hash url:% B-Tree url:%',
to_char(i, '9999999999'),
to_char(pg_relation_size('shorturl_key_hash_index'), '99999999999'),
to_char(pg_relation_size('shorturl_key_btree_index'), '99999999999'),
to_char(pg_relation_size('shorturl_url_hash_index'), '99999999999'),
to_char(pg_relation_size('shorturl_url_btree_index'), '99999999999');
END IF;
END LOOP;
END;
$$;
dynamic hash table
the structure of a hash table is as follows:
When the hash table needs to be expanded, it is added one bucket at a time, but the memory space is applied with many buckets as a segment at one time. This avoids the performance degradation caused by frequent application space. After extending a bucket, we only need to traverse the elements in the corresponding one old bucket, and then move the elements to be moved to the new bucket.
usage tips
postgresql cluster
Create a unified management, flexible cloud-native production deployment to deploy a personalized database as a service (DBaaS) on k8s. (https://github.com/CrunchyData/postgres-operator)
#config postgresql operator
wget <https://raw.githubusercontent.com/CrunchyData/postgres-operator/master/examples/quickstart.sh>
chmod +x ./quickstart.sh
./examples/quickstart.sh
#create postgresql cluster
pgo create cluster mynewcluster
pgo test mynewcluster
Postgresql operator
introduction and usage of the postgresql operator is here: https://operatorhub.io/operator/postgres-operator. a graphical user interface called postgres-operator-ui is available. different PostgreSQL roles can be defined wtih Teams API. some customized postgresql operator is also available, such as Stolon, Crunchy Data, Zalando, KubeDB, StackGres and https://github.com/flant/postgres-operator (comparision: https://blog.flant.com/comparing-kubernetes-operators-for-postgresql/) to deploy the postgresql operator following deployment manifest can be used:
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
name: staging-db
spec:
numberOfInstances: 3
patroni:
synchronous_mode: true
postgresql:
version: "12"
resources:
limits:
cpu: 100m
memory: 1Gi
requests:
cpu: 100m
memory: 1Gi
sidecars:
- env:
- name: DATA_SOURCE_URI
value: 127.0.0.1:5432
- name: DATA_SOURCE_PASS
valueFrom:
secretKeyRef:
key: password
name: postgres.staging-db.credentials
- name: DATA_SOURCE_USER
value: postgres
image: wrouesnel/postgres_exporter
name: prometheus-exporter
resources:
limits:
cpu: 500m
memory: 100Mi
requests:
cpu: 100m
memory: 100Mi
teamId: staging
volume:
size: 2Gi
monitoring postgres on kubernetes
#deploy monitoring pod
kubectl apply -f https://raw.githubusercontent.com/CrunchyData/postgres-operator/v4.5.0/installers/metrics/kubectl/postgres-operator-metrics.yml
#collect data
pgo create cluster hippo --metrics --replica-count=1
pgo create cluster rhino --metrics --replica-count=1
pgo create cluster zebra --metrics
Postgres Notify for Real Time Dashboards
with the NOTIFY and LISTEN commands we can implement something akin to the Observer pattern in SQL engine. The Observer pattern allows one class of object to “listen” for incoming events and another class to send events to those listeners.
Oberver pattern
observers can be created to watch for changes and immediately update the state and UI of an app.
#create production item work in progress
create table production_item_wip (
id serial primary key,
insert_time timestamp default NOW(),
production_item_id int references production_item(id),
production_stage_id int references production_stage(id),
employee_id int references employee(id)
);
#create NOTIFY syntax
create
or replace function fn_production_stage_modified() returns trigger as $psql$
begin
perform pg_notify(
'order_progress_event',
'Time to refresh those screens!'
);return new;
end;$psql$ language plpgsql;
#create trigger
create trigger production_stage before
insert
on production_item_wip for each row execute procedure fn_production_stage_modified();
LISTEN pattern
Listening pattern is like this: LISTEN order_progress_event;
#create view showing how many products have progressed through each production stage today
create view view_daily_production_stats as
select
count(1) as stage_count,
ps.name as stage_namefrom production_item_wip piw
join production_stage ps on ps.id = piw.production_stage_idwhere date(piw.insert_time) = date(now())
group by
ps.id
#callback function
var clients = [];
function eventCallback(event) {
query('select * from view_daily_production_stats', (data) => {
clients.map(c => {
c.send(data);
});
});
}
client.connect(function(err, client) {
var query = client.query("LISTEN order_progress_event");
client.on("notification", eventCallback);
});
;