Check out the new site at learncheminformatics.com!

The following is kept as an archive, but probably won't be updated regularly.

Welcome to the Indiana Cheminformatics Education Portal (ICEP). ICEP is a repository of freely accessible cheminformatics educational materials maintained by David Wild, director of the Indiana University Cheminformatics Program. The purpose of this repository is to broaden education in cheminformatics. It is part of a wider effort that also includes an innovative cheminformatics distance education program, a Graduate Certificate in Chemical Informatics from Indiana University, a low-cost cheminformatics self-study guide, and a collaboration with Open Source Drug Discovery for sharing cheminformatics educational materials. ICEP includes free introductory learning materials and videos, along with news, social media, and the latest journal articles relating to cheminformatics.


New!!

We are now running a free online introductory cheminformatics course - it's not too late to join.

ICEP's wiki-based introductory cheminformatics learning materials


The following introductory topics are adapted from materials that we use in our I571 Chemical Information Technology class, which serves as our introductory cheminformatics course at Indiana University. These themes are developed further in the Introducing Cheminformatics eBook.


Free short tutorial videos





Introducing Cheminformatics eBook now available for $25 - also available in library and company versions



New eBook! - Introducing Cheminformatics: an intensive self-study guide.
This Lulu PDF eBook gives an intensive introduction to cheminformatics, including the history of the field, representing 2D and 3D chemical structures on computer, storing and using databases of chemical and related biological information, and handling chemical information on the web and in the scholarly literature. It also gives an overview of some advanced topics such as clustering and diversity, QSAR and predictive modeling, 3D alignment and docking, and cheminformatics software toolkits. It is aimed at life scientists and computer scientists in both industry and academia who need a rapid, flexible introduction to this field, and comes as a regular PDF. It is available in an individual version for $25, or in versions that can be used in an academic library system or company-wide. For more information, check out the eBook on Lulu.


Other learning resources




Cheminformatics news, blogs and social media




    So much to do, so little time
    • Looking for PowerApp Developers by Rajarshi Guha Sep 26, 2017
      On behalf of Prof. Debarchana Ghosh: Looking for someone with expertise in developing and maintaining PowerApp applications that include queries to the Google Maps API. I have an app currently in production for which I need some help with modificat...
    • Waterfall Plots for Dose Response Curves by Rajarshi Guha Sep 20, 2017
      Waterfall plots are a common visualization method to view multiple spectra and have some similarities with joy plots. In the high throughput screening world, people have plotted multiple dose response curves, offset on the z-axis to produce something...
    • Survey Software Developer Opportunity by Rajarshi Guha Mar 8, 2017
      On behalf of Prof. Debarchana Ghosh (UConn): I’m designing a survey to collect information about a person’s daily activities in their neighborhood and their social networks, i.e. the people they meet and talk to. Examples of daily...
    • CSA Trust Grant – Call for Proposals by Rajarshi Guha Feb 20, 2017
      Applications Invited for CSA Trust Grant for 2017 The Chemical Structure Association (CSA) Trust is an internationally recognized organization established to promote the critical importance of chemical information to advances in chemical research....
    • Endnote XML to HTML or LaTeX by Rajarshi Guha Dec 31, 2016
      Over the last few years I’ve been maintaining my publication list as a BibTeX file, managed by BibDesk. This is handy when writing papers, but it’s also useful to use this data to keep my CV updated or generate a publications...
    • Freedom from the IF: Impact Neutral Publishing by Rajarshi Guha Dec 26, 2016
      I came across a post from Jan Jensen a few months ago about a GRC meeting that he had attended. What caught my eye however, was his comment on “impact neutral” publishing. Specifically, he mentions For me “impact neutrality”...
    • Deep Learning in Chemistry by Rajarshi Guha Nov 8, 2016
      Deep learning (DL) is all the rage these days and this approach to predictive modeling is being applied to a wide variety of problems, including many in computational drug discovery. As a dilettante in the area of deep learning, I’ve been fo...
    • A Report from a Stranger in a Strange Land by Rajarshi Guha Oct 26, 2016
      I just got back from ACoP7, the yearly meeting of the International Society of Pharmacometrics (ISoP). Now, I don’t do any PK/PD modeling (hence the “strange land”) but was invited to talk about our high throughput screening plat...
    • From Algorithmic Fairness to QSAR Models by Rajarshi Guha Aug 8, 2016
      The topic of algorithmic fairness has started receiving a lot of attention due to the ability of predictive models to make decisions that might discriminate against certain classes of people. The reasons for this include biased training data, corr...
    • Database Licensing & Sustainability by Rajarshi Guha May 14, 2016
      Update (07/28/16): DrugBank/OMx have updated the licensing conditions for DrugBank data in response to concerns raised earlier by various people and groups. See here for a detailed response from Craig Knox A few days back I came across, via my Twi...

    chem-bla-ics
    • Two conference proceedings: nanopublications and Scholia by Egon Willighagen Oct 15, 2017
      The nanopublication conference article in Scholia. It takes effort to move scholarly publishing forward. And the traditional publishers have not all shown to be good at that: we're still basically stuck with machine-broken channels like PDFs and Rea...
    • CDK used in SIRIUS 3: metabolomics tools from Germany by Egon Willighagen Oct 8, 2017
      Screenshot from the SIRIUS 3 Documentation. License: unknown. It has been ages since I blogged about work I heard about and think should receive more attention. So, I'll try to pick up that habit again. After my PhD research (about machine learning (chemom...
    • new paper: "The future of metabolomics in ELIXIR" by Egon Willighagen Oct 4, 2017
      CC-BY from F1000 article. This spring I attended a meeting organized by researchers from the European metabolomics community, including from PhenoMeNal to talk about proposing a use case to ELIXIR. Doing research in metabolomics and being part...
    • New paper: "RDFIO: extending Semantic MediaWiki for interoperable biomedical data management" by Egon Willighagen Sep 9, 2017
      Figure 10 from the article showing what the DrugMet wiki with the pKa data looked like. CC-BY. When I was still doing research at Uppsala University, I had an internship student, Samuel Lampa, who did wonderful work on knowledge representation and lo...
    • DataCite: the PubMed for data and software by Egon Willighagen Aug 27, 2017
      We have services like PubMed, Europe PMC, and Google Scholar to make a list of literature. Scholia/Wikidata and ORCID are upcoming services, but for data and software there are fewer options. One notable exception is DataCite (two past blogs ...
    • Updated HMDB identifier scheme by Egon Willighagen Aug 26, 2017
      I have not found further details about it yet, but noticed half an hour ago that the Human Metabolome Database (doi:10.1093/nar/gks1065) seems to have changed all their identifiers: they added extra zeros. The screenshot for D-fructose on the ...
    • What about postprint servers? by Egon Willighagen Aug 26, 2017
      Various article version types, including pre and post. Source: SHERPA/ROMEO. Now that preprint servers are picking up speed, let's talk about postprint servers. Sure, we have plenty of places to place and find discussions about the content of articl...
    • Text mining literature that mention JRC representative nanomaterials by Egon Willighagen Aug 17, 2017
      The week before a short holiday in France (nature, cycling, hiking, touristic CERN visit; thanks to Philippe for the ViaRhone tip!), I did some further work on contentmining literature that mention the JRC representative nanomaterials. One im...
    • Wikidata visualizes SMILES strings with John Mayfield's CDK Depict by Egon Willighagen Jul 30, 2017
      SVG depiction of D-ribulose. Wikidata is building up a curated collection of information about chemicals. A lot of data originates from Wikipedia, but active users are augmenting this information. Of particular interest, in this respect, is Se...
    • new paper: "A transcriptomics data-driven gene space accurately predicts liver cytopathology and drug-induced liver injury" by Egon Willighagen Jul 5, 2017
      Figure from the article. CC-BY. One of the projects I worked on at Karolinska Institutet with Prof. Grafström was the idea of combining transcriptomics data with dose-response data. Because we wanted to know if there was a relation betwee...

    Useful Chemistry
    • Matthew McBride wins Nov 2012 ONS Challenge Award by Jean-Claude Bradley Nov 16, 2012
      Matthew McBride, an undergraduate chemistry major at Drexel University working in the Bradley Laboratory, was awarded the November 2012 Open Notebook Science Challenge Award sponsored by the Royal Society of Chemistry.  ChemSpider founder Ant...
    • MiniSymposium Bradley Lab 2011 by Jean-Claude Bradley Oct 5, 2011
      I recently presented a 15 minute summary of the current research in my lab on September 29, 2011 at the Drexel University Department of Chemistry Faculty Mini-Symposium. The main project discussed was the Open Melting Point Collection done in coll...
    • Patrick Ndungu talk at Drexel on Nanotechnology by Jean-Claude Bradley Aug 18, 2011
      One of my former Ph. D. students, Patrick Ndungu (now at University of KwaZulu Natal, South Africa) will be speaking at Drexel University on Friday August 19, 2011 at 12:30 in Disque 109. Some Interesting Perspectives on the Integration of Nanoma...
    • Google Apps Scripts Workshop at Drexel University by Jean-Claude Bradley Aug 17, 2011
      Andrew Lang will be in Philadelphia next week and we will be running a workshop on Leveraging Google Spreadsheets with Scripts for Research and Teaching. Now that our institution is no longer providing Microsoft Office for students in the fall te...
    • Open Melting Point Collection Book Edition 1 by Jean-Claude Bradley Aug 11, 2011
      Several months of work through a collaboration between myself, Andrew Lang, Antony Williams and Evan Curtin have culminated in the publication of an Open Melting Point Collection Book. Like our other books on solubility and Reaction Attempts, the...
    • Rapid analysis of melting point trends and models using Google Apps Scripts by Jean-Claude Bradley Jul 19, 2011
      I recently reported on how Google Apps Scripts can be used to facilitate the recording and calculations associated with a chemistry laboratory notebook (also see resource page). I will demonstrate here how these scripts can be used to rapidly disco...
    • Practical Tips on using Google Apps Scripts for Chemistry Applications by Jean-Claude Bradley Jul 14, 2011
      A few weeks ago I described our use of Google Apps Scripts, developed by Rich Apodaca and Andrew Lang, as an intuitive interface to information related to a chemistry laboratory notebook. Since then we have been using these tools to actively plan...
    • Open Notebook Science Talk at HUBbub 2011 by Jean-Claude Bradley Jul 1, 2011
      On April 6, 2011 I presented at the HUBzero Conference in Indianapolis on "Open Notebook Science: Does Transparency Work?". This presentation will first describe Open Notebook Science, the practice of making the laboratory notebook and all associat...
    • The 4-benzyltoluene melting point twist by Jean-Claude Bradley Jun 22, 2011
      Evan Curtin and I were in the lab this morning to follow up on our effort to curate the melting point of 4-benzyltoluene. I identified the next step to confirm an upper limit of -15 C: With the information available thus far from our experiments (...
    • Google Apps Scripts for an intuitive interface to organic chemistry Open Notebooks by Jean-Claude Bradley Jun 18, 2011
      Rich Apodaca recently demonstrated how Google Apps Scripts can be added to Google Spreadsheets to enable simple calling of web services for chemistry applications (gChem). Although we have been using web service calls from within a Google spreads...

    Noel O'Blog
    • Open Babel in a snap by Noel O'Boyle Oct 1, 2017
      Maintainers of Linux distributions do an amazing job packaging applications for users, so that "apt/dnf install openbabel"  makes software packages available for use within seconds. The only downside is that the version of the software may be...
    • How many cheminformaticians does it take to read a SMILES string? by Noel O'Boyle Sep 23, 2017
      A few days ago I posted the following poll question on Twitter: How many Hs are on the N in the molecule described by this SMILES string? N(C)(C)(C)C I provided four possible answers: Zero; One; Depends; Can't say as no such molecule. Please take a moment to...
    • My ACS talk on Kekulization and aromatic SMILES by Noel O'Boyle Aug 28, 2017
      Here are the slides for the talk I presented last week at the ACS meeting in Washington. It describes my understanding of the Daylight toolkit as deduced by John: We need to talk about Kekulization, Aromaticity and SMILES from baoilleach The fu...
    • Faster toolkit, faster! by Noel O'Boyle Jul 23, 2017
      After an extended hiatus, I've been back doing a bit of work on Open Babel, and specifically I've been working on improving performance. Basic reading and writing of molecules should be limited by disk I/O speed and not CPU, but this is not (yet) ...
    • Using WebLogo3 to create a sequence logo from Python by Noel O'Boyle May 5, 2017
      A sequence logo is a way to display the variation at particular positions of multiple-aligned sequences. Here I used WebLogo 3 to do this. The documentation is not great, even once you find it, and working out how to do my specific use case requir...
    • Whiskas statistics and the pitfalls of mean rank by Noel O'Boyle Jan 23, 2017
      In the fingerprint similarity paper I published last year, I made the following observation: "However the use of the mean rank is itself problematic as the pairwise similarity of two methods can be altered (and even inverted) by adding additional ...
    • Counting hydrogens in a SMILES string - The Rules by Noel O'Boyle Jan 14, 2017
      Hydrogens are not usually listed explicitly in a SMILES string, but instead can be inferred from a set of rules. To be precise, when reading a SMILES, the position of every hydrogen is known unambiguously once you know the rules. Oh - did I forget...
    • The clockwisdom of SMARTS by Noel O'Boyle Dec 30, 2016
      In earlier posts I discussed/investigated how stereochemistry is represented in SMILES. Here I'm going to try to figure out what I thought would be relatively simple, how to write a SMARTS pattern that matches a chiral molecule and all of its supe...
    • Open Babel 2.4.0 released by Noel O'Boyle Sep 23, 2016
      As announced by Geoff on the mailing list, Open Babel 2.4.0 is now available to download: I'm pleased to announce that Open Babel 2.4.0 has finally been released. This release represents a major update and should be a stable upgrade, strongly recomm...
    • My new thing - providing manuscript images as PDFs by Noel O'Boyle Aug 21, 2016
      My latest oeuvre (on the topic of which fingerprint is best) was published by J. Cheminf. a few weeks ago. For the first time, instead of providing the images as PNGs, I submitted them as PDFs. You see, John had worked me over. At the start, I thou...

    petermr's blog
    • CopyCamp2017 4: What is (Responsible) ContentMining? by pm286 Sep 26, 2017
      My non-profit organization contentmine.org has the goal of making contentmining universally available to everyone through three arms: Advocacy. Why it's so valuable and why you should convince others and why restrictions should be removed. Communi...
    • CopyCamp2017 3: The Hague Declaration and why ContentMining is important by pm286 Sep 26, 2017
      In 2015 LIBER (The European body for Research Libraries) collected a number of leading figures in the Library and Scholarship world to create the Hague Declaration on freedom for Text and Data Mining. This stated not only the aspirations but ...
    • CopyCamp 2: workshop on ContentMining - what is it and how to do it by pm286 Sep 26, 2017
      In the last post I explained why I became interested in contentmining to do scientific research and started to explain how it is still a major political and legal challenge. I am excited that I have been asked to …
    • CopyCamp: why Copyright reform has failed TDM / ContentMining - 1 The vision and the tragedy by pm286 Sep 25, 2017
      I am honoured to have been invited to speak at CopyCamp2017, "The Internet of Copyrighted Things". I've not been to CopyCamp before, but I've been to similar events and I'm delighted to see it is sponsored by organisations, some ...
    • WLIC/IFLA2017: UBER for scholarly communications and libraries? It’s already here… by pm286 Aug 21, 2017
      WLIC/IFLA2017: UBER for scholarly communications and libraries? It’s already here… You all know of the digital revolution that is changing the world of service - Amazon, UBER, AirBnB, coupled to Facebook, Google, Siri, etc. The common...
    • ContentMine at IFLA2017: The future of Libraries and Scholarly Communications by pm286 Aug 21, 2017
      ContentMine at IFLA2017: The future of Libraries and Scholarly Communications   I am delighted to have been invited to talk at IFLA (https://www.ifla.org/annual-conference), the global overarching body for Libraries of all sorts. I’m in...
    • What is TextAndData/ContentMining? by pm286 Jul 11, 2017
      What is TextAndData/ContentMining? I prefer “ContentMining” to the formal legal phrase “Text and Data Mining” because it emphasizes all kinds of content - audio, photos, videos, diagrams, chemistry, etc. I chose it to asser...
    • Text and Data Mining: Overview by pm286 Jul 11, 2017
      Text and Data Mining: Overview Tomorrow The University of Cambridge Office of Scholarly Communication is running a 1-day Symposium on Text and Data Mining (https://docs.google.com/document/d/1l4N2fSFgpL3iMbjKC3IxHz7GpNVvERB5NzxqWp8jZQo/edit ). I h...
    • How Wikidata can change the world of scientific information 1/n by pm286 Nov 5, 2016
      We're getting involved in Wikidata! It will change the world of scientific (and other) information. So here is an emerging conversation, hopefully over several blog posts. Wicki: Hang on! What's Wikidata? And Wikimedia? I've heard of Wikipedia, bu...
    • The critical role of e-Theses: award acceptance speech at NDLTD by pm286 Jul 12, 2016
      I am honoured by this award; I ‘ll describe the current struggle for ownership of digital scholarly knowledge, emphasize young people and machine-understandable theses and suggest practices.   Early Career Researchers see the digital li...



        Most Recent Articles: Journal of Cheminformatics
        • Chemotion ELN: an Open Source electronic lab notebook for chemists in academia by Pierre Tremouilhac, An Nguyen, Yu-Chieh Huang, Serhii Kotov, Dominic Sebastian Lütjohann, Florian Hübsch, Nicole Jung and Stefan Bräse Sep 24, 2017
          The development of an electronic lab notebook (ELN) for researchers working in the field of chemical sciences is presented. The web based application is available as an Open Source software that offers modern ...
        • Erratum to: The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching by Egon L. Willighagen, John W. Mayfield, Jonathan Alvarsson, Arvid Berg, Lars Carlsson, Nina Jeliazkova, Stefan Kuhn, Tomáš Pluskal, Miquel Rojas-Chertó, Ola Spjuth, Gilleain Torrance, Chris T. Evelo, Rajarshi Guha and Christoph Steinbeck Sep 19, 2017
        • Scoria: a Python module for manipulating 3D molecular data by Patrick Ropp, Aaron Friedman and Jacob D. Durrant Sep 17, 2017
          Third-party packages have transformed the Python programming language into a powerful computational-biology tool. Package installation is easy for experienced users, but novices sometimes struggle with depende...
        • A review of parameters and heuristics for guiding metabolic pathfinding by Sarah M. Kim, Matthew I. Peña, Mark Moll, George N. Bennett and Lydia E. Kavraki Sep 14, 2017
          Recent developments in metabolic engineering have led to the successful biosynthesis of valuable products, such as the precursor of the antimalarial compound, artemisinin, and opioid precursor, thebaine. Synth...
        • G.A.M.E.: GPU-accelerated mixture elucidator by Alioune Schurz, Bo-Han Su, Yi-Shu Tu, Tony Tsung-Yu Lu, Olivia A. Lin and Yufeng J. Tseng Sep 14, 2017
          GPU acceleration is useful in solving complex chemical information problems. Identifying unknown structures from the mass spectra of natural product mixtures has been a desirable yet unresolved issue in metabo...

        Journal of Chemical Information and Modeling: Latest Articles (ACS Publications)

        Other resources


