ABSTRACT

The first statement creates a new schema to contain these tables. The schema name is arbitrary, but a natural choice is the name of the company or research organization. Next, the structure table is defined with a smiles column of type text. This column is declared unique and not null. The uniqueness constraint prevents the same SMILES string from being entered twice, which guards against duplicate compounds. The id column is defined using the serial data type, which associates a unique integer with each structure. Other tables in this schema use this id to relate their data to compounds in the structure table. The cansmi column holds the canonical Simplified Molecular Input Line Entry System (SMILES) string. The fp column holds a bit-string fingerprint of the structure. The cansmi and fp columns can be used when searching for compounds in this table.
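
The table layout described above can be tried out with a small sketch. It is not the original SQL: it uses Python's built-in sqlite3 module as a stand-in, so several details are assumptions. SQLite has no schema-creation statement and no serial type, so an attached in-memory database with the hypothetical name chem plays the role of the schema, INTEGER PRIMARY KEY AUTOINCREMENT plays the role of serial, and the fingerprint is stored as text. The helper add_structure is likewise hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the database server described in the text.
# An attached in-memory database named "chem" (hypothetical name) acts as the schema.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS chem")

# smiles is unique and not null; id is generated automatically for each row.
# cansmi and fp are left nullable here; a cheminformatics toolkit would fill them.
conn.execute(
    """
    CREATE TABLE chem.structure (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- stand-in for serial
        smiles TEXT NOT NULL UNIQUE,
        cansmi TEXT,
        fp     TEXT                                -- stand-in for a bit string
    )
    """
)


def add_structure(conn, smiles, cansmi=None, fp=None):
    """Insert one structure and return the id assigned to it (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO chem.structure (smiles, cansmi, fp) VALUES (?, ?, ?)",
        (smiles, cansmi, fp),
    )
    return cur.lastrowid


ethanol_id = add_structure(conn, "CCO")
benzene_id = add_structure(conn, "c1ccccc1")

# Entering the same SMILES string again violates the uniqueness constraint.
try:
    add_structure(conn, "CCO")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A missing SMILES string violates the not-null constraint.
try:
    add_structure(conn, None)
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM chem.structure").fetchone()[0]
```

In this sketch the constraint compares SMILES strings character by character, so a differently written string for the same molecule (for example OCC for ethanol) would be accepted as a new row. Comparing on the cansmi column is one way to catch such cases.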