The concept of “garbage in–garbage out” applies when building a data-driven predictive model. The time for training a model increases exponentially with the number of features. Too many features also become a hurdle when attempting to select meaningful features. Moreover, the risk of overfitting increases as the number of features increases given a limited number of […]
Short answer: Knowledge graph completion is the act of inferring new edges, called facts, in a knowledge graph based on the already existing relational data. Understanding this requires explaining two concepts, namely 1) what a knowledge graph is, and 2) what knowledge graph completion is. For clarity, use of the term “knowledge graph” here is […]
From comedy shows to everyday conversation, humor plays an important part in human interaction, whether it’s used to expedite courtship or enhance social bonding. And yet, if asked why we find certain things funny, many of us will struggle to express the reason. For something so prevalent in our daily lives, it is certainly […]
A well-defined transfer learning approach is desirable in the computational biology field, as it allows us to learn better in a specific field by utilizing the knowledge in related fields. However, this approach is not trivial at all, and so far there are some notable challenges in applying transfer learning to biological data: Different biological features […]
There are many choices and assumptions to make when designing a machine learning (ML)-based system. Taking the common choice is appealing but can undermine your system’s performance. Having recently designed an ML-based system for prediction of gene expression (GE) [1], we made various uncommon but sensible choices and assumptions given the particular problem […]
The emergence of antibiotic-resistant bacteria is one of the major problems of the modern era. Bacteria are developing resistance against antibiotics at an alarming rate, and we are running out of effective antibiotics. Bacteria deploy several short- and long-term adaptive strategies to counteract the effect of potent antibiotics. Short-term strategies include pumping out drugs using multidrug efflux […]
The search for life beyond Earth has been a research topic for many scientists over the years. Human space exploration and the imminent possibility of actually colonizing other planets (Mars being our best candidate) raise concerns among some of these scientists, since we could unintentionally contaminate space with our Earth microbes. Multiple […]
When training a binary classifier, cross entropy (CE) loss is usually used, as squared error loss cannot distinguish bad predictions from extremely bad predictions. The CE loss is defined as follows: $$L_{CE}(y,t) = -\left[t\log y + (1-t)\log(1-y)\right]$$ where $y$ is the predicted probability of the sample falling in the positive class $(t=1)$. Here, $y = \sigma(z)$, where […]
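To make the comparison above concrete, here is a minimal Python sketch. It assumes the standard negative log-likelihood form of the CE loss (with a leading minus sign, so the loss is non-negative). The function names and the two example probabilities are illustrative choices, not taken from the post.

```python
import math

def cross_entropy(y, t):
    """Binary cross entropy for predicted probability y and target t (0 or 1).

    Assumes the conventional negative log-likelihood form.
    """
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))

def squared_error(y, t):
    """Squared error between predicted probability y and target t."""
    return (y - t) ** 2

# Two wrong predictions for a positive sample (t = 1).
# The probabilities 0.1 and 0.001 are illustrative assumptions.
bad, very_bad = 0.1, 0.001

# How much more each loss penalizes the extremely bad prediction.
se_gap = squared_error(very_bad, 1) - squared_error(bad, 1)
ce_gap = cross_entropy(very_bad, 1) - cross_entropy(bad, 1)

print(f"squared error gap: {se_gap:.4f}")
print(f"cross entropy gap: {ce_gap:.4f}")
```

Squared error is bounded by 1 for probabilities, so the two predictions receive nearly the same penalty. The CE loss grows without bound as $y$ approaches 0 for a positive sample, so it separates them clearly.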
One of the concepts that can improve the effectiveness of a machine learning (ML) method is the consideration of sparsity in its design. Here I give a short summary of the benefits of sparsity considerations in ML. Definition: A set of numbers (e.g. vector, matrix, etc.) is considered sparse when a high percentage of the values are […]
Lab-grown meat, more popularly known as clean meat, synthetic meat, or in vitro meat, is prepared in an experimental lab using animal tissue culture approaches. Lab-grown meat has recently gained a lot of attention because it can significantly reduce greenhouse gas emissions and save energy. A study published by the scientists of […]
Imagine your computer learning much more efficiently from related well-known knowledge fields and applying these concepts to lesser-known knowledge fields: from identifying dogs to identifying cats [1], from playing Age of Empires to playing Starcraft, or from reading articles written in English to reading articles written in Spanish. Transfer learning attempts to solve these kinds […]
Scientists use jargon and abbreviations all the time, and communication of science to a broad audience, which can also include scientists from diverse backgrounds and fields, can be jeopardized if such terms are not explained and defined in a simpler, accessible way. Even beginner scientists can struggle when reading scientific papers or navigating a new […]
In my last post, we explored a machine learning method known as transfer learning used to counter problems in which two models have similar purposes but data is more readily available for one than the other. By exploiting similarities in data, transfer learning allows us to leverage one model to increase accuracy in the other, […]
In the era of big data, selecting which experiments to conduct to fill the gaps in the data at hand is becoming a challenge. The selected experiments should be targeted such that they bring in the most information related to the problem of interest. By conducting targeted experiments, we will reduce the redundancy […]
Salmonella is a bacterium that can cause typhoid fever and diarrhea in humans and animals. Each year, 16 million cases of typhoid fever are reported, causing 600,000 deaths.[1] It also causes more than 15% of infection cases in the U.S. each year.[2] People spend several billion dollars annually to try to prevent Salmonella […]
Evaluating a binary classifier is a common task in machine learning. In this blog, we will talk about what metrics to use for evaluating a binary classifier, why multiple metrics are needed, how to plot a receiver operating characteristic (ROC) curve, how a random classifier works, and […]
I wanted to quickly take a moment to reflect on the work that our team has completed so far. We are in the midst of exciting, state-of-the-art research that sits at the intersection of deep learning and biological research. A few areas that we are currently involved in: • Optimal experimental design […]
Of late, training deep neural networks on tasks has become increasingly easy. Whether the task is image recognition, natural language processing, or gene expression prediction, if one has a large amount of data, one can train a model to map inputs to outputs with very high accuracy. The question subsequently becomes, how much data is […]
Research in biomedical sciences has its trends, as in many areas of science. I’ve become interested in microbiome research recently and might stay on it for a number of years. As a computational biologist, I wanted to quantitatively examine the growth in microbiome research. So I used ISI Web of Science. I performed two independent searches from […]
The emergence of antibiotic-resistant bacteria at an alarming rate is one of the major ongoing catastrophes. Most antibiotics are failing to control these bacteria, and now the emergence of “superbugs” has created serious health and safety concerns. Researchers around the world have been trying to develop alternative methods to control microbial infections. Antimicrobial peptides (AMPs) have […]
For the fans of antimicrobial soaps, the determination by the Food and Drug Administration (FDA) to remove triclosan from antimicrobial soaps by the end of last year was an unpleasant surprise [1]. The rule was motivated by extensive research in the field suggesting harmful effects from triclosan exposure in animals and the development of antimicrobial […]