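CONTENTS
Methodological Basis of In Silico Approaches
QSARs
LFE-Type Models
Free–Wilson-Type Models
Biologically Based Algorithms
In Silico Approaches for PBPK Model Parameters
In Silico Approaches for Tissue:Air Partition Coefficients
In Silico Approaches for Blood:Air PCs
In Silico Approaches for Tissue:Blood PCs
In Silico Approaches for Protein Binding
In Silico Approaches for Clearance Constants
In Silico Approaches for Skin Permeability Constants
In Silico Approaches for Oral Absorption Constants
[…] Approaches into Risk Assessment
[…] QSARs for Chloroethanes
[…] Free–Wilson QSARs into PBPK Models
[…] Risk Assessment of Methyl Chloroform
Future Directions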
INTRODUCTION
Physiologically based pharmacokinetic (PBPK) models are increasingly being used for conducting dose extrapolations, route extrapolations, species extrapolations, and exposure scenario extrapolations required for risk assessments (Andersen et al.). PBPK models are basically mechanism-based mathematical descriptions of the processes of absorption, distribution, metabolism, and excretion in the intact organism (Krishnan and Andersen). The algebraic and differential equations constituting the PBPK models are solved with the knowledge of various input parameters, namely physiological (tissue volumes, blood flow rates, cardiac output, alveolar ventilation rate), physicochemical (blood:air partition coefficients, tissue:blood partition coefficients, absorption rate constants, permeability coefficients), and biochemical (maximal velocity, Michaelis affinity constant) parameters. Whereas the information on physiological parameters can be obtained from the biomedical literature (Arms and Travis), this is frequently not the case for physicochemical parameters (partition coefficients, absorption constants, and permeability coefficients) and biochemical parameters (hepatic or renal clearances, maximal velocity of metabolism, and Michaelis affinity constant).
The physicochemical and biochemical parameters needed for constructing chemical-specific PBPK models can be obtained using in vivo or in vitro approaches. In vivo approaches involve collection of pharmacokinetic data in exposed animals and analysis of such data using a PBPK model. By adjusting the model simulations to match the experimental data, the numerical values of the missing parameters can be estimated. Such a procedure is reliably applied for estimating one or two parameters at a time. Parameter estimation using in vivo studies, particularly for nonvolatiles, can be tedious and can require extensive use of animals (Krishnan and Andersen).
The in vitro methods facilitating reduced animal use have been proven to be useful only for estimating partition coefficients. The in vitro-derived metabolism constants cannot be directly incorporated within PBPK models, even though freshly isolated hepatocytes and postmitochondrial fractions appear to hold some promise (Krishnan and Andersen). The considerations of cost-effectiveness, as well as reduction and replacement of animal use, have led to the development of other alternative approaches, particularly in silico approaches, for estimating PBPK model parameters. Two kinds of in silico approaches are useful in this context. The first one involves the use of available data for various PBPK parameters in order to develop equations that associate characteristics of chemicals to the magnitude of the parameters. An example of this category is the classical quantitative structure-activity relationship (QSAR) approach. Another in silico approach involves the development of mechanistic algorithms based on an understanding of the interrelationships among certain biological and chemical determinants in order to predict the numerical value of PBPK model parameters. The objectives of this chapter are to review the state of the art of in silico approaches (QSARs, biologically based algorithms) for estimating PBPK model parameters and to illustrate how the in silico–based PBPK models can be used in human health risk assessment applications.
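
To make the first kind of in silico approach concrete, the sketch below fits a one-descriptor linear equation by ordinary least squares and then uses it to predict a parameter value for a new chemical. It is an illustration only: the function names `fit_line` and `predict`, the choice of log Pow as the descriptor, and every numerical value are assumptions made for this example. None of them are taken from the chapter or from measured data.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of the descriptor about its mean.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    # Sum of cross-products of descriptor and parameter deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, descriptor):
    """Apply the fitted equation to a descriptor value."""
    return slope * descriptor + intercept


# Hypothetical training set (invented numbers, NOT experimental data):
# one structural descriptor (log Pow) and one parameter (log tissue:air PC).
log_pow = [1.0, 1.5, 2.0, 2.5, 3.0]
log_pc = [0.4, 0.7, 1.1, 1.3, 1.7]

slope, intercept = fit_line(log_pow, log_pc)

# Predicted log PC for a hypothetical chemical with log Pow = 2.2.
new_log_pc = predict(slope, intercept, 2.2)
print(f"log PC = {slope:.3f} * log Pow + {intercept:.3f}")
print(f"predicted log PC at log Pow 2.2: {new_log_pc:.3f}")
```

A real application would use several descriptors and report goodness-of-fit statistics. A single descriptor is used here only to keep the link between the data and the fitted equation easy to follow.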
METHODOLOGICAL BASIS OF IN SILICO APPROACHES
The following paragraphs provide a brief description of the methodological basis of the two types of in silico approaches, namely QSARs and biologically based algorithms.
QSARs
The QSARs typically relate a biological activity (or, more specifically, a property in this context) to structural features specific to chemicals through a mathematical function ($f$):
\[ \text{Biological property} = f(\text{structural feature}) \]
Since empirical data are used to derive the mathematical function fulfilling the above relationship, depending upon the nature of the data, the functions can be linear, multilinear, or supralinear. Two types of QSARs have been used to estimate the value of PBPK model parameters: linear free-energy (LFE) models and Free–Wilson models.
LFE-Type Models
LFE-type models are quantitative relationships that describe activity as a function of chemical structure, relying upon the principles of thermodynamics (Hansch and Fujita). The basis of the commonly used Hansch approach is that the differences in magnitude of a given biological activity within a series of chemicals correspond to changes in the free energy ($\Delta G$) during the processes involved. As the difference in biological activity and the change in free energy are likely to be linearly related, the resulting mathematical relationships are referred to as "linear free-energy relationships." Because it is very difficult to directly determine $\Delta G$ in biological systems, its thermodynamic components, such as the energy ($\Delta E$), enthalpy ($\Delta H$), and entropy ($\Delta S$), are used instead and are represented by a series of structural descriptors that can be derived for any given molecule (Seydel and Schaper). The coefficients for these descriptors (i.e., slopes and intercepts) are then regressed using standard statistical techniques. In this type of approach, structural descriptors can be broadly classified into three general types: electrostatic, steric, or hydrophobic. LFE models can incorporate one or many of these categories of structural descriptors, based on the statistical significance of each feature in the final model.
Electrostatic Features in LFE-Type Models
Electronic effects typically include electron-donating and electron-withdrawing tendencies, partial atomic charges, and electrostatic field densities, as defined by Hammett sigma ($\sigma$) values, resonance parameters ($R$ values), inductive parameters ($F$ values), and Taft substituent values ($\rho^*$, $\sigma^*$, $E_s$). Because ionized molecules cannot pass through biological membranes, and electrostatic-effect constants are derived from the ionization characteristics of the molecule, relating the pharmacokinetic behavior of all chemicals to electrostatic features alone is not entirely appropriate. Abraham and Weathersby used dipolarity/polarizability, among others, as electrostatic descriptors for relating structure to the numerical values of tissue:air and blood:air partition coefficients of a series of chemicals. They observed that water, plasma, blood, lung, kidney, muscle, brain, fat, and olive oil progressively became less dipolar/polarizable due to increasing lipid content. Thus, the electrostatic descriptor is relevant only for functionally substituted compounds such as propanol. Lewis and Dickins have shown the importance of electrostatic descriptors, such as ionization potential and pKa, in relating the structure of various drugs to metabolic rates and binding to CYP. Correlations were improved with the incorporation of these descriptors, especially for ionizable or polar drugs. This is probably related to the binding site of the compound on the CYP protein, which contains polar amino acids.
Steric Features in LFE-Type Models
Steric effects are conventionally represented by values calculated for molar refractivity and the Taft steric parameter. However, since steric effects describe the "bulkiness" of the molecule, they can include molecular volume, molecular weight, surface area, carbon chain branching, etc. Molecular connectivity indices and features derived from three-dimensional QSAR can also be considered as being of a steric nature, although the relationship between structure and these features is often obscure and not as intuitive in certain instances. Furthermore, obtaining the chemical-specific values for these features often requires the use of specialized chemical modeling software.
Gargas et al. used steric descriptors in order to relate structure to PBPK model parameter values, and they suggested that an LFE equation combining connectivity indices and ad hoc descriptors (such as the number of halogens in a compound) provided better descriptions for tissue:air partition coefficients than using either descriptor alone. Use of connectivity indices is limited because the relationship between these descriptors and structure is not informative in a transparent and direct manner. For example, the first-order valence connectivity index represents both structural and electronic features of a compound in a complex way. Order indices represent multiple substitution patterns of the halogens on the carbons in the compounds, flexibility in polymers, or halogen substitution patterns. More intuitive is the other descriptor used by Gargas et al., namely the number of halogen atoms present in the molecule, although it may not always be relevant for all molecules of interest. Gargas et al. also used steric parameters to relate structure to maximal velocity of metabolism ($V_{max}$), but because of the level of electronic information contained in the connectivity indices used in the descriptions, the authors suggested that more accurate modeling of $V_{max}$ could be attempted by using both steric descriptors and more specific electronic information (such as charge distribution).
Hydrophobic Features in LFE-Type Models
Hydrophobic features in LFE-type equations are frequently represented by using the log octanol:water partition coefficient (log $P_{ow}$) or the hydrophobic parameter $\pi$, which is derived from $P_{ow}$. However, other partition coefficients (e.g., water:air [$P_{wa}$], oil:air [$P_{oa}$], oil:water, and n-hexadecane:air [$P_{hea}$] partition coefficients) and solubility parameters have also been used. Hydrophobic parameters, namely octanol:air, oil:air, or water:air, have been extensively used for relating structure and PBPK model parameters. The use of various datasets and experimental data in multiple species (rat, human, or fish) in the regressions has led to varied coefficient values for $P_{ow}$, $P_{oa}$, or $P_{wa}$. Because blood:air and tissue:air partition coefficients (PCs) represent distribution between a biological matrix (consisting of lipid and water) and air, it is logical that reasonably good correlations are obtained using a partition measure for another relevant matrix and air (i.e., $P_{wa}$ or $P_{oa}$). Furthermore, since tissue is composed of water, lipids, and protein, the coefficients of these descriptors have been suggested to reflect tissue composition (Abraham and Weathersby). Recently, Meulenberg and Vijverberg, after an extensive review of the literature, found that the values of the coefficients obtained following regression analysis using $P_{wa}$, $P_{oa}$, and the experimental PC values for rats and humans were essentially the same as the tissue water and lipid content, highlighting the importance of tissue composition in the partitioning process.
Free–Wilson Type Models
Although the relationships between pharmacokinetic parameters and hydrophobic determinants have been explored frequently, the development of such relationships using other determinants in the LFE approach is not as straightforward. Furthermore, it is often unclear how these determinants (as used in LFE equations) relate to PK processes, and in order to successfully explore all relevant structural combinations, a substantial dataset is required and is often unavailable. Because of this, alternatives to the LFE approach have been explored. In this regard, Free and Wilson developed a series of substituent constants by relating biological activity with the nature and frequency of occurrence of specific functional groups in the parent molecule. This methodological approach is reflected by the following equation:
\[ \text{Activity} = A + \sum_{i} \sum_{j} G_{ij} X_{ij} \]
where $A$ is defined as the average biological activity for the series, $G_{ij}$ is the contribution to activity of a functional group $i$ in the $j$th position, and $X_{ij}$ is the presence or absence of the functional group $i$ in the $j$th position.
The Free–Wilson approach requires that the contributions of substituents be additive and that a sufficiently large database be available to facilitate the determination of the contribution of various substituents. In the pharmacokinetic arena, Free–Wilson type QSARs have been developed for the rate of oral absorption ($K_{a,oral}$) of sulfonamides (Seydel and Schaper). Based on the fragment constants derived, it was possible to predict the compound with the highest $K_{a,oral}$ value by combining the fragments with the highest contributions. More recently, Free–Wilson algorithms for relating structure to PBPK model parameters for a series of chloroethanes in rats, humans, and fish have been developed (Fouchécourt and Krishnan; Fouchécourt et al.). These algorithms were then successfully integrated into a PBPK model in order to simulate the kinetics of these chemicals in the various species. The limiting factor of such an approach, however, is that the Free–Wilson model developed for chloroethanes could not be used to predict the parameter values for chemicals lacking the common structure and substituents. Such a limitation can be overcome with the development and use of biologically based algorithms.
Biologically Based Algorithms
Contrary to the in silico methods described above, biologically based algorithms do not require a priori knowledge of experimental data. Here, information on specific biological processes that determine the magnitude of a PBPK parameter is gathered, and a predictive mathematical relationship between the PBPK parameter and biological determinants is developed. The predictions of the algorithm are then compared with experimental data for validation purposes. Uncertainty regarding the prediction of parameter values for de novo compounds is somewhat reduced because the algorithm is based on known biological mechanisms. In cases where predicted values differ from experimental data, hypotheses concerning other plausible mechanisms can be generated and incorporated within the algorithm for further verification. Theoretically, these types of algorithms can be developed for any PBPK parameter, regardless of the chemical class or molecular structure. The development and application of such algorithms is limited only by the current level of understanding of the mechanistic basis and phenomena that determine the magnitude of the PBPK model parameters.
At the present t[…] based algorithms are […] model parameters. A[…] section have been […] organic substances.
IN SILICO APPROACHES FOR PBPK MODEL PARAMETERS
In Silico Approaches for Tissue:Air Partition Coefficients
Tissue:air PCs d[…] VOCs in tissues an[…] […]ically based algorith[…] […]ficients of a variety […]
In developing a L[…] […]ersby observe[…] a hydrophobic desc[…] for functionally subs[…] correctly und[…]