The term FAIR, which stands for findability, accessibility, interoperability, and reusability, was officially coined in 2016 [1]. During my omics integration project, I realized the importance of practicing FAIR data management in scientific fields. Nowadays, not only do the methods of storing data vary among different databases, but so do the formats of datasets […]
A long time ago, I learned in school about the human digestive system. I learned how the food I ate, composed of proteins, carbohydrates, and fat, would be broken into less complex structures: amino acids, peptides, simple sugars, and fatty acids, by the enzymes present in my digestive tract. Those simpler structures would then be […]
Gut bacteria have co-evolved with animals for hundreds of thousands of years. There is, however, still potential for bacterial adaptation within each individual, according to recently published research (1). The researchers combined metagenomics with isolation, cultivation, and sequencing of the bacterial content of 30 fecal samples to investigate how Bacteroides fragilis, a frequent commensal in […]
In this blog, we will walk through a probabilistic formulation of the well-known technique of principal component analysis (PCA). The probabilistic PCA (PPCA) was introduced by Bishop [1]. PPCA does not generate better results than PCA, but it permits a broad range of future extensions. Besides, PPCA falls in the category of Bayesian models, […]
Biofilms are surface-adhered aggregates of microorganisms embedded within a thick exopolysaccharide (EPS) matrix. Bacteria in this polysaccharide matrix form well-organized structures that enable the transport of nutrients and waste into and out of the biofilm. Most of the time, biofilms comprise several bacterial species. Due to protective shielding by the polysaccharide matrix, […]
What is enzyme design? Enzyme catalysis is a key process in a wide range of industries, from pharmaceuticals to food science [1-6]. In an evolutionary context, nature has optimized enzymes in living organisms to adapt to specific niches over millions, if not billions, of years. Despite the constant evolution, these enzymes might not meet our modern-day […]
The word2vec technique has become an essential part of building text models and has even been adapted to other fields, such as building recommendation systems. In this blog, I will introduce the basic concepts and applications of word2vec. When building a machine learning model to understand text, the first challenge is to encode the text as […]
Metabolic Flux Analysis (MFA) is a technique used to quantify metabolic fluxes (the rate of turnover of molecules). MFA has at least two important applications: First, by studying metabolic flux, it is possible to adjust the amount of substrates/ingredients in the medium/protocol of cell culture [1] or mutate genes [2] to improve productivity. Second, by […]
Large-scale recombinant protein production is one of the most significant achievements of modern biotechnology. These proteins have wide applications in molecular biology, therapeutics, and industry. Efficient recombinant protein production using genetically manipulated organisms has saved lives by providing pure therapeutic and prophylactic proteins in accessible amounts. Today, more than 75 recombinant proteins […]
The threat of antibiotic resistance is well documented and frequently highlighted in the media, with constant mentions of “superbugs” and the potential end of effective antimicrobials. In the United States alone, over 2 million people become infected with antibiotic-resistant bacteria, and the death toll can reach 23,000 annually (1). In Europe, the estimates are that […]
The concept of “garbage in–garbage out” applies when building a data-driven predictive model. The time for training a model increases exponentially with the number of features. Too many features also become a hurdle when attempting to select meaningful features. Moreover, the risk of overfitting increases as the number of features increases given a limited number of […]
Short answer: Knowledge graph completion is the act of inferring new edges, called facts, in a knowledge graph based on the already existing relational data. Understanding this requires an explanation of two concepts, namely 1) what a knowledge graph is, and 2) what knowledge graph completion is. For clarity, use of the term “knowledge graph” here is […]
From comedy shows to everyday conversation, humor plays an important part in human interaction, whether it’s used to expedite courtship or enhance social bonding. And yet, if asked why we find certain things funny, many of us will struggle to express the reason. For something so prevalent in our daily lives, it is certainly […]
A well-defined transfer learning approach is desirable in the computational biology field, as it allows us to learn better in a specific field by utilizing the knowledge in related fields. However, this approach is far from trivial, and so far there are some notable challenges in applying transfer learning to biological data: Different biological features […]
There are many choices and assumptions to make when designing a machine learning (ML)-based system. Taking the common choice is appealing but can undermine your system's performance. While recently designing an ML-based system for prediction of gene expression (GE) [1], we made various uncommon but sensible choices and assumptions given the particular problem […]
The emergence of antibiotic-resistant bacteria is a major problem of the modern era. Bacteria are developing resistance against antibiotics at an alarming rate, and we are running out of effective antibiotics. Bacteria deploy several short- and long-term adaptive strategies to counteract the effect of potent antibiotics. Short-term strategies include pumping out drugs using multidrug efflux […]
The search for life outside planet Earth has been the topic of research of many scientists over the years. Human space exploration and the imminent possibility of actually colonizing other planets (Mars being our best candidate) raise concerns among some of these scientists, since we could unintentionally contaminate space with our Earth microbes. Multiple […]
When training a binary classifier, cross entropy (CE) loss is usually used, as squared error loss cannot distinguish bad predictions from extremely bad predictions. The CE loss is defined as follows: $$L_{CE}(y,t) = -t\log y - (1-t)\log(1-y)$$ where $y$ is the predicted probability of the sample falling in the positive class $(t=1)$. $y = \sigma(z)$, where […]
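To make the claim about bad versus extremely bad predictions concrete, here is a minimal sketch comparing the two losses on a positive sample ($t=1$). It assumes the standard binary cross entropy (the negative log-likelihood); the function names, the `eps` clipping constant, and the example probabilities are illustrative choices, not taken from the post.

```python
import math


def squared_error(y, t):
    """Squared error between predicted probability y and target t."""
    return (y - t) ** 2


def cross_entropy(y, t, eps=1e-12):
    """Binary cross entropy (negative log-likelihood) for one sample.

    eps is an illustrative clipping constant that keeps log() finite
    when y is exactly 0 or 1.
    """
    y = min(max(y, eps), 1.0 - eps)
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))


t = 1  # the sample truly belongs to the positive class
for y in (0.1, 0.001):  # a bad and an extremely bad prediction
    print(f"y={y}: squared error={squared_error(y, t):.4f}, "
          f"cross entropy={cross_entropy(y, t):.4f}")
```

The squared error is bounded by 1, so it barely changes between the two predictions, whereas the cross entropy keeps growing as $y$ approaches 0.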
One of the concepts that can improve the effectiveness of a machine learning (ML) method is the consideration of sparsity in its design. Here I give a short summary of the benefits of sparsity considerations in ML. Definition: A set of numbers (e.g., a vector or matrix) is considered sparse when a high percentage of the values are […]
Lab-grown meat, more popularly known as clean meat, synthetic meat, or in vitro meat, is prepared in an experimental lab using animal tissue culture approaches. Lab-grown meat has recently gained a lot of attention because it can significantly reduce greenhouse gas emissions and save energy. A study published by scientists of […]
Imagine your computer learning much more efficiently from related well-known knowledge fields and applying these concepts to lesser-known knowledge fields: from identifying dogs to identifying cats [1], from playing Age of Empires to playing Starcraft, or from reading articles written in English to reading articles written in Spanish. Transfer learning attempts to solve these kinds […]
Scientists use jargon and abbreviations all the time, and communication of science to a broad audience, which can also include scientists from diverse backgrounds and fields, can be jeopardized if such terms are not explained and defined in a simpler, accessible way. Even beginner scientists can struggle when reading scientific papers or navigating a new […]
In my last post, we explored a machine learning method known as transfer learning used to counter problems in which two models have similar purposes but data is more readily available for one than the other. By exploiting similarities in data, transfer learning allows us to leverage one model to increase accuracy in the other, […]
In the era of big data, selecting which experiments to conduct to fill the gaps in the data at hand is becoming a challenge. The selected experiments should be targeted such that they bring in the most information related to the problem of interest. By conducting targeted experiments, we will reduce the redundancy […]
Salmonella is a bacterium that can cause typhoid fever and diarrhea in humans and animals. Each year, 16 million cases of typhoid fever are reported, causing 600,000 deaths.[1] It also causes more than 15% of infection cases in the U.S. each year.[2] People spend several billion dollars annually to try to prevent Salmonella […]
Evaluating a binary classifier is a common task in machine learning. In this blog, we will talk about what metrics to use for evaluating a binary classifier, why multiple metrics are needed, how to plot a receiver operating characteristic (ROC) curve, how a random classifier works, and […]
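As a rough illustration of how the points of an ROC curve can be obtained, the sketch below sweeps the decision threshold over the sorted scores and records the false positive rate and true positive rate at each step. The function names, the toy scores and labels, and the assumption that all scores are distinct are illustrative and not taken from the post.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points from sweeping the decision threshold.

    Assumes labels are 0/1, both classes are present, and scores are
    distinct (ties are not grouped in this sketch).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    # Lower the threshold one sample at a time, from highest score to lowest.
    for _, label in sorted(zip(scores, labels), key=lambda p: -p[0]):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print(points)
print(auc(points))
```

Plotting the FPR values on the x-axis against the TPR values on the y-axis gives the ROC curve. A classifier that assigns scores at random is expected to lie close to the diagonal from (0, 0) to (1, 1), with an area under the curve near 0.5.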
I wanted to quickly take a moment to reflect on the work that our team has completed so far. We are in the midst of exciting state-of-the-art research at the intersection of deep learning and biological research. A few areas that we are currently involved in: • Optimal experimental design […]
Of late, training deep neural networks has become increasingly easy. Whether the task is image recognition, natural language processing, or gene expression prediction, if one has a large amount of data, one can train a model to map inputs to outputs with very high accuracy. The question subsequently becomes, how much data is […]
Research in biomedical sciences has its trends, as do many areas of science. I’ve become interested in microbiome research recently and might stay on it for a number of years. As a computational biologist, I wanted to quantitatively examine the growth in microbiome research, so I used ISI Web of Science. I performed two independent searches from […]
The emergence of antibiotic-resistant bacteria at an alarming rate is one of the major ongoing catastrophes. Most antibiotics are failing to control these bacteria, and now the emergence of “superbugs” has created serious health and safety concerns. Researchers around the world have been trying to develop alternative methods to control microbial infections. Antimicrobial peptides (AMPs) have […]
For the fans of antimicrobial soaps, the determination by the Food and Drug Administration (FDA) to remove triclosan from antimicrobial soaps by the end of last year was an unpleasant surprise [1]. The rule was motivated by extensive research in the field suggesting harmful effects from triclosan exposure in animals and the development of antimicrobial […]