>[Interesting – two presenters! This is their undergraduate project]
Bioinformatics looking for genes activated by thiamine, using transcription factor binding motifs. [Some biological background] Thi1 and Thi5 binding sites are being detected.
Thiamine uptake causes repression of Thi1 and Thi5.
Used upstream sequences from genes of interest. Used motif detection tools to generate a dataset of potential sites.
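The talk did not say how the upstream sequences were pulled out, so the following is only a minimal sketch of one common way to do it. The function names, the 0-based end-exclusive coordinates, and the default region length are all assumptions for illustration, not details from the presentation.

```python
# Hypothetical helper: extract the region upstream of a gene's start.
# Coordinates are assumed 0-based and end-exclusive; `length` is an arbitrary default.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def upstream_region(chrom_seq, gene_start, gene_end, strand, length=1000):
    """Return up to `length` bases upstream of the gene, read 5'->3' on the gene's strand."""
    if strand == "+":
        # Upstream lies before the gene start; clip at the chromosome start.
        begin = max(0, gene_start - length)
        return chrom_seq[begin:gene_start]
    if strand == "-":
        # Upstream lies after the gene end on the forward strand, so flip it.
        region = chrom_seq[gene_end:gene_end + length]
        return reverse_complement(region)
    raise ValueError("strand must be '+' or '-'")

# Toy chromosome, not real data.
chrom = "AACCGTTTAGGC"
plus_upstream = upstream_region(chrom, 4, 8, "+", length=3)
minus_upstream = upstream_region(chrom, 4, 8, "-", length=3)
```

The sequences returned this way would then be the input to a motif detection tool.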
Looking at zinc finger TFs, so the binding sites are bipartite and palindromic. Used BioProspector (from Stanford), which fit their needs best of the motif detection tools.
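To make "bipartite, palindromic" concrete: such a site has two half-sites separated by a gap, where the second half is the reverse complement of the first. The sketch below is a naive scan for that pattern. It is not how BioProspector works, and the half-site length, gap range, and the CGG half-site in the toy sequence are illustrative assumptions only.

```python
# Naive scan for bipartite palindromic sites (illustrative, not BioProspector's algorithm).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_bipartite_palindromes(seq, half_len=3, min_gap=0, max_gap=11):
    """Return (start, left_half, gap) for each site whose right half is the
    reverse complement of its left half. Parameter defaults are assumptions."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * half_len + 1):
        left = seq[start:start + half_len]
        if set(left) - set("ACGT"):
            continue  # skip half-sites containing ambiguous bases such as N
        target = reverse_complement(left)
        for gap in range(min_gap, max_gap + 1):
            right_start = start + half_len + gap
            right = seq[right_start:right_start + half_len]
            if len(right) < half_len:
                break  # ran off the end of the sequence
            if right == target:
                hits.append((start, left, gap))
    return hits

# Toy sequence: CGG, a 5-base gap, then CCG (the reverse complement of CGG).
hits = find_bipartite_palindromes("TTCGGAAAAACCGTT", half_len=3, min_gap=5, max_gap=5)
```

A real motif finder scores sites probabilistically against a background model; an exact-match scan like this is only meant to show the shape of the site being searched for.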
Implemented a pattern recognition network (feed forward), using training sets from BioProspector plus negative (random) controls. Ran many trials across several gene sets and tested many different parameters.
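The presenters did not show how sequences were fed to the network. A common approach, assumed here, is to one-hot encode each base and pair the predicted sites with an equal number of random sequences as negatives. All names below are hypothetical, and the two "positive" sequences are made-up placeholders rather than real BioProspector output.

```python
import random

BASES = "ACGT"

def one_hot(seq):
    """Flatten a DNA sequence into a vector of 4 * len(seq) values.
    Bases outside ACGT (e.g. N) become four zeros."""
    vec = []
    for base in seq.upper():
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def random_negatives(n, length, exclude, seed=0):
    """Draw n uniform random sequences that do not appear in the positive set."""
    rng = random.Random(seed)
    excluded = set(exclude)
    negatives = []
    while len(negatives) < n:
        candidate = "".join(rng.choice(BASES) for _ in range(length))
        if candidate not in excluded:
            negatives.append(candidate)
    return negatives

def build_dataset(positives, seed=0):
    """Return (X, y): encoded positives labelled 1, random negatives labelled 0."""
    negatives = random_negatives(len(positives), len(positives[0]), positives, seed)
    X = [one_hot(s) for s in positives + negatives]
    y = [1] * len(positives) + [0] * len(negatives)
    return X, y

# Placeholder sequences of length 21, not real predicted sites.
positives = ["ACGTACGTACGTACGTACGTA", "CGGAAAAAAAAAAAAAAACCG"]
X, y = build_dataset(positives)
```

Each row of `X` could then go to any feed-forward classifier. Uniform random negatives are the simplest choice; shuffled or genomic background sequences are other options the presenters may or may not have used.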
Used 3 different gene sets: (1) nmt1 and nmt2 gene sets from different species, (2) a gene set from S. pombe only (6 genes), (3) all gene sets from all species.
Preliminary results: used a length of 21; trained on S. pombe and S. japonicus, tested on S. octosporus.
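Holding out a whole species for testing, as described above, can be sketched as a simple filter over labelled records. The record layout (species, sequence, label) and the function name are assumptions for illustration, the sequences are placeholders, and reading "length of 21" as the input window length is also an assumption.

```python
WINDOW = 21  # assumed to be the input sequence length mentioned in the talk

def split_by_species(records, test_species):
    """Split (species, sequence, label) records into train and test lists,
    keeping every record from `test_species` out of the training set."""
    train = [r for r in records if r[0] not in test_species]
    test = [r for r in records if r[0] in test_species]
    return train, test

# Placeholder records, not real sites.
records = [
    ("S. pombe", "A" * WINDOW, 1),
    ("S. japonicus", "C" * WINDOW, 1),
    ("S. octosporus", "G" * WINDOW, 1),
    ("S. octosporus", "T" * WINDOW, 0),
]
train, test = split_by_species(records, {"S. octosporus"})
```

Splitting by species rather than at random checks whether the learned pattern carries over to a genome the network has never seen.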
Results seem very good for a first attempt. Evaluation with a “confusion matrix” looks strong (accuracy appears to be in the range of 86-95%).
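For readers unfamiliar with the term, a confusion matrix for a two-class problem just counts true/false positives and negatives, and accuracy is the fraction of correct calls. A minimal sketch follows; the labels are made up for illustration and are not the presenters' results.

```python
def confusion_matrix(y_true, y_pred):
    """Count outcomes for binary labels (1 = binding site, 0 = control)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts

def accuracy(cm):
    """Fraction of predictions that were correct: (TP + TN) / total."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total

# Made-up labels, only to show the calculation.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
```

Accuracy alone can hide an imbalance between false positives and false negatives, which is why the full matrix is worth looking at.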
Final testing with the neural network: significant findings will be verified biologically, and knockout strains may be tested with microarrays.