Using
Suggest editsThese use cases show how you can use EDB Postgres Advanced Server with Chemaxon JChem PostgreSQL Cartridge.
Create table with MOLECULE datatype column
Using SQL, create the table in which a column has the MOLECULE datatype to store and process chemical data:
CREATE TABLE public.chemical_data ( empno INT4 Primary Key, ename VARCHAR(50), location CLOB, mol MOLECULE('sample') );
View the structure of this table:
DESC public.chemical_data; Table "public.chemical_data" Column | Type | Collation | Nullable | Default ----------+-----------------------+-----------+----------+--------- empno | integer | | not null | ename | character varying(50) | | | location | clob | | | mol | molecule('sample') | | | Indexes: "chemical_data_pkey" PRIMARY KEY, btree (empno)
Insert/update/delete chemical data
Using SQL statements, insert data, including chemical data, into the table:
INSERT INTO public.chemical_data VALUES (10,'EMP1','COUNTRY1','c1ccccc1'); INSERT INTO public.chemical_data VALUES (20,'EMP2','COUNTRY2','O=Cc1ccccc1'); INSERT INTO public.chemical_data VALUES (30,'EMP3','COUNTRY3','C=CCOC=O'); INSERT INTO public.chemical_data VALUES (40,'EMP4','COUNTRY4','c1nc2cncnc2n1');
Use a SELECT statement to retrieve the data from the table, including the chemical data in the column mo1:
SELECT * FROM public.chemical_data; empno | ename | location | mol -------+-------+----------+--------------- 10 | EMP1 | COUNTRY1 | c1ccccc1 20 | EMP2 | COUNTRY2 | O=Cc1ccccc1 30 | EMP3 | COUNTRY3 | C=CCOC=O 40 | EMP4 | COUNTRY4 | c1nc2cncnc2n1 (4 rows)
Update and delete data from the table, including the chemical data:
UPDATE public.chemical_data SET ename = 'EMP#1',mol='C=CCOC=O' WHERE empno = 20; SELECT * FROM public.chemical_data; empno | ename | location | mol -------+-------+----------+--------------- 10 | EMP1 | COUNTRY1 | c1ccccc1 20 | EMP#1 | COUNTRY2 | O=Cc1ccccc1 30 | EMP3 | COUNTRY3 | C=CCOC=O 40 | EMP4 | COUNTRY4 | c1nc2cncnc2n1 (4 rows) DELETE FROM public.chemical_data WHERE empno = 30; SELECT * FROM public.chemical_data; empno | ename | location | mol -------+-------+----------+--------------- 10 | EMP1 | COUNTRY1 | c1ccccc1 30 | EMP3 | COUNTRY3 | C=CCOC=O 40 | EMP4 | COUNTRY4 | c1nc2cncnc2n1 (3 rows)
Create indexes in molecule columns
For indexing a column containing chemical structures, the following indextypes are provided:
- chemindex - sortedchemindex
This example creates indexes in the Molecule columns on the table:
CREATE TABLE public.chemical_data ( empno INT4, ename VARCHAR(50), location CLOB, mol1 MOLECULE('sample') NOT NULL, mol2 MOLECULE('sample') ); CREATE INDEX chmcal_index1 ON public.chemical_data USING chemindex(mol1); CREATE INDEX chmcal_index2 ON public.chemical_data USING sortedchemindex(mol2); DESC public.chemical_data; Table "public.chemical_data" Column | Type | Collation | Nullable | Default ----------+-----------------------+-----------+----------+--------- empno | integer | | | ename | character varying(50) | | | location | clob | | | mol1 | molecule('sample') | | not null | mol2 | molecule('sample') | | | Indexes: "chmcal_index1" chemindex (mol1) "chmcal_index2" sortedchemindex (mol2)
Searching for data from the Molecule column
You can perform substructure, duplicate, superstructure, and full-fragment searches.
Substructure search
Substructure search finds all structures that contain the query structure as a subgraph. Sometimes the chemical subgraph isn't the only thing that's provided. Certain query features that further restrict the structure are also provided. If special molecular features are present on the query, such as stereochemistry or charge, then only those targets match the ones that also contain the feature. However, if a feature is missing from the query, it doesn't need to be missing, by default.
Perform substructure search using the symmetrical sub-/super-structure search operator |<|
.
This SQL statement creates a table and inserts data:
CREATE TABLE public.chemical_data ( empno INT4, ename VARCHAR(50), location CLOB, mol1 MOLECULE('sample') NOT NULL, mol2 MOLECULE('sample') ); INSERT INTO public.chemical_data VALUES (10,'EMP1','COUNTRY1','c1ccccc1','C=CCOC=O'); INSERT INTO public.chemical_data VALUES (20,'EMP2','COUNTRY2','O=Cc1ccccc1','C=CCOC=O'); INSERT INTO public.chemical_data VALUES (30,'EMP3','COUNTRY3','C=CCOC=O','c1nc2cncnc2n1'); INSERT INTO public.chemical_data VALUES (40,'EMP4','COUNTRY4','c1nc2cncnc2n1',NULL);
These SQL statements search data from the Molecule column:
SELECT * FROM public.chemical_data WHERE 'C=CCOC=O'|<| mol1 AND mol2 IS NOT NULL ORDER BY mol1; empno | ename | location | mol1 | mol2 -------+-------+----------+----------+--------------- 30 | EMP3 | COUNTRY3 | C=CCOC=O | c1nc2cncnc2n1 (1 row) SELECT * FROM public.chemical_data WHERE 'cncn'|<| mol1 ORDER BY mol1 LIMIT 2; empno | ename | location | mol1 | mol2 -------+-------+----------+---------------+------ 40 | EMP4 | COUNTRY4 | c1nc2cncnc2n1 |
Duplicate search
Use duplicate searches mainly before database inserts to check whether the given molecule is already in the database.
Perform duplicate searches using the |=|
operator.
This SQL statement searches data from the Molecule column:
SELECT * FROM public.chemical_data WHERE 'C=CCOC=O' |=| mol1; empno | ename | location | mol1 | mol2 -------+-------+----------+----------+--------------- 30 | EMP3 | COUNTRY3 | C=CCOC=O | c1nc2cncnc2n1 (1 row)
Superstructure search
Superstructure searches find all molecules for which the query is the superstructure of the target.
Perform superstructure searches using the sub-/super-structure search operator |>|
.
This SQL statement searches data from the Molecule column:
SELECT * FROM public.chemical_data WHERE 'C=CCOC=O'|>| mol1; empno | ename | location | mol1 | mol2 -------+-------+----------+----------+--------------- 30 | EMP3 | COUNTRY3 | C=CCOC=O | c1nc2cncnc2n1 (1 row)
Full-fragment (exact fragment) search
Full-fragment searches are between substructure searches and full searches. The query must fully match a whole fragment of the target. Other fragments might be present in the target, but they are ignored.
This SQL statement searches data from the Molecule column:
SELECT * FROM public.chemical_data WHERE 'C=CCOC=O'|=>| mol1 ORDER BY empno LIMIT 2; WARNING: am_functions.cpp:458 Index scan of reverse full fragment search is not supported. empno | ename | location | mol1 | mol2 -------+-------+----------+----------+--------------- 30 | EMP3 | USA | C=CCOC=O | c1nc2cncnc2n1 (1 row)
Import data using COPY
You can import data into the table that has a Molecule column using the COPY command. For example:
Create a
.csv
file containing comma-separated data:cat /home/edb/Desktop/m_data.csv 10,EMP1, COUNTRY1, c1ccccc1 20,EMP2, COUNTRY2, O=Cc1ccccc1 30,EMP3, COUNTRY3, C=CCOC=O 40,EMP4, COUNTRY4, c1nc2cncnc2n1
Using this SQL statement, create the table:
CREATE TABLE public.chemical_data ( empno INT4, ename VARCHAR(50), location CLOB, mol1 MOLECULE('sample') );
Load data into the table:
COPY public.chemical_data FROM '/home/edb/Desktop/m_data.csv' (FORMAT csv);
You can then fetch data from the table using the following SQL statement:
SELECT * FROM public.chemical_data ORDER BY empno; empno | ename | location | mol1 -------+-------+----------+--------------- 10 | EMP1 | COUNTRY1 | c1ccccc1 20 | EMP2 | COUNTRY2 | O=Cc1ccccc1 30 | EMP3 | COUNTRY3 | C=CCOC=O 40 | EMP4 | COUNTRY4 | c1nc2cncnc2n1 (4 rows)
Could this page be better? Report a problem or suggest an addition!