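CONTENTS

Methodological Basis of In Silico Approaches
LFE-Type Models
Free-Wilson Type Models
Biologically Based Algorithms
In Silico Approaches for PBPK Model Parameters
In Silico Approaches for Tissue:Air Partition Coefficients
In Silico Approaches for Blood:Air PCs
In Silico Approaches for Tissue:Blood PCs
In Silico Approaches for Protein Binding
In Silico Approaches for Clearance Constants
In Silico Approaches for Skin Permeability Constants
In Silico Approaches for Oral Absorption Constants
… Approaches into Risk Assessment
… QSARs for Chloroethanes
… Free–Wilson QSARs into PBPK Models
… Risk Assessment of Methyl Chloroform
Future Directions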

INTRODUCTION

Physiologically based pharmacokinetic (PBPK) models are increasingly being used for conducting dose extrapolations, route extrapolations, species extrapolations, and exposure scenario extrapolations required for risk assessments (Andersen et al.). PBPK models are basically mechanism-based mathematical descriptions of the processes of absorption, distribution, metabolism, and excretion in the intact organism (Krishnan and Andersen). The algebraic and differential equations constituting the PBPK models are solved with the knowledge of various input parameters, namely physiological (tissue volumes, blood flow rates, cardiac output, alveolar ventilation rate), physicochemical (blood:air partition coefficients, tissue:blood partition coefficients, absorption rate constants, permeability coefficients), and biochemical (maximal velocity, Michaelis affinity constant) parameters. Whereas the information on physiological parameters can be obtained from the biomedical literature (Arms and Travis), this is frequently not the case for physicochemical (partition coefficients, absorption constants, and permeability coefficient) and biochemical parameters (hepatic or renal clearances, maximal velocity of metabolism, and Michaelis affinity constant).

The physicochemical and biochemical parameters needed for constructing chemical-specific PBPK models can be obtained using in vivo or in vitro approaches. In vivo approaches involve collection of pharmacokinetic data in exposed animals and analysis of such data using a PBPK model. By adjusting the model simulations to match the experimental data, the numerical values of the missing parameters can be estimated. Such a procedure is reliably applied for estimating one or two parameters at a time. Parameter estimation using in vivo studies, particularly for nonvolatiles, can be tedious and can require extensive use of animals (Krishnan and Andersen).

The in vitro methods, facilitating reduced animal use, have been proven to be useful only for estimating partition coefficients. The in vitro-derived metabolism constants cannot be directly incorporated within PBPK models, even though freshly isolated hepatocytes and postmitochondrial fractions appear to hold some promise (Krishnan and Andersen). The considerations of cost-effectiveness as well as reduction/replacement of animal use have led to the development of other alternative approaches, particularly in silico approaches, for estimating PBPK model parameters. Two kinds of in silico approaches are useful in this context. The first one involves the use of available data for various PBPK parameters in order to develop equations that associate characteristics of chemicals to the magnitude of the parameters. An example of this category is the classical quantitative structure–activity relationship (QSAR) approach. Another in silico approach involves the development of mechanistic algorithms based on an understanding of the interrelationships among certain biological and chemical determinants in order to predict the numerical value of PBPK model parameters. The objectives of this chapter are to review the state of the art of in silico approaches (QSARs, biologically based algorithms) for estimating PBPK model parameters and to illustrate how the in silico–based PBPK models can be used in human health risk assessment applications.
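
As a rough illustration of the first kind of in silico approach described above (using available data to derive an equation that links a chemical characteristic to a PBPK parameter), the following minimal Python sketch fits a straight line by ordinary least squares. It is only a sketch under stated assumptions: the function names `fit_qsar_line` and `predict_parameter`, the single-descriptor linear form, and all numeric values are hypothetical and do not come from the chapter or from measured data.

```python
from typing import Sequence, Tuple


def fit_qsar_line(descriptor: Sequence[float],
                  parameter: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of: parameter = slope * descriptor + intercept."""
    n = len(descriptor)
    if n != len(parameter) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(descriptor) / n
    mean_y = sum(parameter) / n
    # Sum of squares of the descriptor around its mean.
    sxx = sum((x - mean_x) ** 2 for x in descriptor)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    # Cross-product of descriptor and parameter deviations.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(descriptor, parameter))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_parameter(slope: float, intercept: float,
                      descriptor_value: float) -> float:
    """Apply the fitted line to a chemical that was not in the training set."""
    return slope * descriptor_value + intercept


if __name__ == "__main__":
    # HYPOTHETICAL training data: a structural descriptor for four chemicals
    # and the log of some PBPK parameter (made-up values for illustration).
    descriptor_values = [1.0, 2.0, 3.0, 4.0]
    log_parameter_values = [0.4, 0.9, 1.6, 2.1]

    slope, intercept = fit_qsar_line(descriptor_values, log_parameter_values)
    estimate = predict_parameter(slope, intercept, 2.5)
    print(f"slope={slope:.3f} intercept={intercept:.3f} estimate={estimate:.3f}")
```

With more than one descriptor the same idea extends to multilinear regression; the single-descriptor form is used here only to keep the example short.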

METHODOLOGICAL BASIS OF IN SILICO APPROACHES

The following paragraphs provide a brief description of the methodological basis of the two types of in silico approaches, namely QSARs and biologically based algorithms.

QSARs

The QSARs typically relate a biological activity (or, more specifically, a property in this context) to structural features specific to chemicals through a mathematical function f:

$$\text{Biological property} = f(\text{structural feature})$$

Since empirical data are used to derive the mathematical function fulfilling the above relationship, depending upon the nature of the data, the functions can be linear, multilinear, or supralinear. Two types of QSARs have been used to estimate the value of PBPK model parameters: linear free energy (LFE) models and Free–Wilson models.

LFE-Type Models

LFE-type models are quantitative relationships that describe activity as a function of chemical structure, relying upon the principles of thermodynamics (Hansch and Fujita). The basis of the commonly used Hansch approach is that the differences in magnitude of a given biological activity within a series of chemicals correspond to changes in the free energy (ΔG) during the processes involved. As the difference in biological activity and the change in free energy are likely to be linearly related, the resulting mathematical relationships are referred to as "linear free-energy relationships." Because it is very difficult to directly determine ΔG in biological systems, its thermodynamic components, such as the energy (ΔE), enthalpy (ΔH), and entropy (ΔS), are used instead and are represented by a series of structural descriptors that can be derived for any given molecule (Seydel and Schaper). The coefficients for these descriptors (i.e., slopes and intercepts) are then regressed using standard statistical techniques. In this type of approach, structural descriptors can be broadly classified into three general types: electrostatic, steric, or hydrophobic. LFE models can incorporate one or many of these categories of structural descriptors based on the statistical significance of each feature in the final model.

Electrostatic Features in LFE-Type Models

Electronic effects typically include electron-donating and -withdrawing tendencies, partial atomic charges, and electrostatic field densities as defined by Hammett sigma (σ) values, resonance parameters (R values), inductive parameters (F values), and Taft substituent values (σ*, Es). Because ionized molecules cannot pass through biological membranes and electrostatic effects constants are derived from ionization characteristics of the molecule, relating pharmacokinetic behavior of all chemicals to only electrostatic features is not totally relevant. Abraham and Weathersby used dipolarity/polarizability, among others, as electrostatic descriptors for relating structure to the numerical values of tissue:air and blood:air partition coefficients of a series of chemicals. They observed that water, plasma, blood, lung, kidney, muscle, brain, fat, and olive oil progressively became less dipolar/polarizable due to increasing lipid content. Thus, the electrostatic descriptor is relevant only for functionally substituted compounds such as propanol. Lewis and Dickins have shown the importance of electrostatic descriptors such as ionization potential and pKa in relating structure of various drugs to metabolic rates and binding to CYP. Correlations were improved with the incorporation of these descriptors, especially for ionizable or polar drugs. This is probably related to the binding site of the compound on the CYP protein, which contains polar amino acids.

Steric Features in LFE-Type Models

Steric effects are conventionally represented by values calculated for molar refractivity and the Taft steric parameter. However, since steric effects describe the "bulkiness" of the molecule, they can include molecular volume, molecular weight, surface area, carbon chain branching, etc. Molecular connectivity indices and features derived from three-dimensional QSAR can also be considered as being of a steric nature, although the relationship between structure and these features is often obscure and not as intuitive in certain instances. Furthermore, obtaining the chemical-specific values for these features often requires the use of specialized chemical modeling software.

Gargas et al. used steric descriptors in order to relate structure to PBPK model parameter values, and they suggested that an LFE equation combining connectivity indices and ad hoc descriptors, such as the number of halogens in a compound, provided better descriptions for tissue:air partition coefficients than using either descriptor alone. Use of connectivity indices is limited because the relationship between these descriptors and structure is not informative in a transparent and direct manner. For example, the first-order valence connectivity index represents both structural and electronic features of a compound in a complex way. Order indices represent multiple substitution patterns of the halogens on the carbons in the compounds, flexibility in polymers, or halogen substitution patterns. More intuitive is the other descriptor used by Gargas et al., namely, the number of halogen atoms present in the molecule, although it may not always be relevant for all molecules of interest. Gargas et al. also used steric parameters to relate structure to maximal velocity of metabolism (Vmax), but because of the level of electronic information contained in the connectivity indices used in the descriptions, the authors suggested that more accurate modeling of Vmax could be attempted by using both steric descriptors and more specific electronic information such as charge distribution.

Hydrophobic Features in LFE-Type Models

Hydrophobic features in LFE-type equations are frequently represented by using the log octanol:water partition coefficient (log Pow) or the hydrophobic parameter π, which is derived from Pow. However, other partition coefficients (e.g., water:air [Pwa], oil:air [Poa], oil:water, n-hexadecane:air [Phea] partition coefficients) and solubility parameters have also been used. Hydrophobic parameters, namely, octanol:air, oil:air, or water:air, have been extensively used for relating structure and PBPK model parameters. The use of various datasets and experimental data in multiple species (rat, human, or fish) in the regressions has led to varied coefficient values for Pow, Poa, or Pwa. Because blood:air and tissue:air partition coefficients (PC) represent distribution between a biological matrix (consisting of lipid and water) and air, it is logical that reasonably good correlations are obtained using a partition measure for another relevant matrix and air (i.e., Pwa or Poa). Furthermore, since tissue is composed of water, lipids, and protein, the coefficients of these descriptors have been suggested to reflect tissue composition (Abraham and Weathersby). Recently, Meulenberg and Vijverberg, after an extensive review of the literature, found that values of the coefficients obtained following regression analysis using Pwa, Poa, and the experimental PC values for rats and humans were essentially the same as the tissue water and lipid content, highlighting the importance of tissue composition in the partitioning process.

Free-Wilson Type Models

Although the relationships between pharmacokinetic parameters and hydrophobic determinants have been explored frequently, the development of such relationships using other determinants in the LFE approach is not as straightforward. Furthermore, it is often unclear how these determinants, as used in LFE equations, relate to PK processes, and in order to successfully explore all relevant structural combinations, a substantial dataset is required and is often unavailable. Because of this, alternatives to the LFE approach have been explored. In this regard, Free and Wilson developed a series of substituent constants by relating biological activity with the nature and frequency of occurrence of specific functional groups in the parent molecule. This methodological approach is reflected by the following equation:

$$\text{Activity} = A + \sum_{i} \sum_{j} G_{ij} X_{ij}$$

where $A$ is defined as the average biological activity for the series, $G_{ij}$ is the contribution to activity of a functional group i in the jth position, and $X_{ij}$ the presence or absence of the functional group i in the jth position.

The Free–Wilson approach requires that the contributions of substituents be additive and that a sufficiently large database be available to facilitate the determination of the contribution of various substituents. In the pharmacokinetic arena, the Free–Wilson type QSARs have been developed for the rate of oral absorption (Ka oral) of sulfonamides (Seydel and Schaper). Based on the fragment constants derived, it was possible to predict the compound with the highest Ka oral value by combining the fragments with the highest contributions. More recently, Free–Wilson algorithms for relating structure to PBPK model parameters for a series of chloroethanes in rats, humans, and fish have been developed (Fouchécourt and Krishnan; Fouchécourt et al.). These algorithms were then successfully integrated into a PBPK model in order to simulate the kinetics of these chemicals in the various species. The limiting factor of such an approach, however, is that the Free–Wilson model developed for chloroethanes could not be used to predict the parameter values for chemicals lacking the common structure and substituents. Such a limitation can be overcome with the development and use of biologically based algorithms.

Biologically Based Algorithms

Contrary to the in silico methods described above, biologically based algorithms do not require a priori knowledge of experimental data. Here, information on specific biological processes that determine the magnitude of a PBPK parameter is gathered and a predictive mathematical relationship between the PBPK parameter and biological determinants is developed. The predictions of the algorithm are then compared with experimental data for validation purposes. Uncertainty regarding the prediction of parameter values for de novo compounds is somewhat reduced because the algorithm is based on known biological mechanisms. In cases where predicted values differ from experimental data, hypotheses concerning other plausible mechanisms can be generated and incorporated within the algorithm for further verification. Theoretically, these types of algorithms can be developed for any PBPK parameter regardless of the chemical class or molecular structure. The development and application of such algorithms is only limited by the current level of understanding of the mechanistic basis and phenomena that determine the magnitude of the PBPK model parameters.

At the present t[…] based algorithms are […] model parameters. A[…] section have been […] organic substances […]

IN SILICO A[…]

In Silico Approac[…]

Tissue:air PCs d[…] VOCs in tissues an[…] ically based algorith[…] ficients of a variety […]

In developing a L[…] ersby observe[…] a hydrophobic desc[…] for functionally subs[…] correctly und[…]