Make It FAIR – Principles of Data Management

November 24, 2019

The term FAIR, which stands for findability, accessibility, interoperability, and reusability, was officially coined in 2016 [1]. During my omics integration project, I realized the importance of practicing FAIR data management in scientific fields. Nowadays, not only do the methods of storing data vary among different databases, but also the formats of datasets […]

What a baby eats

June 24, 2019

A long time ago, I learned in school about the human digestive system. I learned how the food I ate, composed of proteins, carbohydrates, and fat, would be broken into less complex structures: amino acids, peptides, simple sugars, and fatty acids, by the enzymes present in my digestive tract. Those simpler structures would then be […]

A look inside your gut: which bacteria are you selecting for?

June 3, 2019

Gut bacteria have co-evolved with animals for hundreds of thousands of years. There is, however, still potential for bacterial adaptation within each individual, according to recently published research (1). The researchers combined metagenomics with isolation, cultivation, and sequencing of the bacterial content of 30 fecal samples to investigate how Bacteroides fragilis, a frequent commensal in […]

Probabilistic principal component analysis

May 30, 2019

In this blog, we will walk through a probabilistic formulation of the well-known technique of principal component analysis (PCA). The probabilistic PCA (PPCA) was introduced by Bishop [1]. PPCA does not generate better results than PCA, but it permits a broad range of future extensions. Besides, PPCA falls in the category of Bayesian models, […]

Targeting biofilms using quorum sensing inhibitors

May 23, 2019

Biofilms are surface-adhered aggregates of microorganisms embedded within a thick layer of exopolysaccharide (EPS) matrix. Bacteria in this polysaccharide matrix form well-organized structures that enable transport of nutrients and waste in and out of the biofilm. Most of the time, biofilms comprise several bacterial species. Due to protective shielding by the polysaccharide matrix, […]

Towards in silico enzyme design

May 8, 2019

What is enzyme design? Enzyme catalysis is a key process in a wide range of industries, from pharmaceuticals to food science [1-6]. In an evolutionary context, nature has optimized enzymes in living organisms to adapt to specific niches over millions, if not billions, of years. Despite the constant evolution, these enzymes might not meet our modern-day […]

The essence and applications of word2vec technique

March 14, 2019

The word2vec technique has become an essential part of building a text model and has even been adapted to other fields, such as building recommendation systems. In this blog, I will introduce the basic concepts and applications of word2vec. When building a machine learning model to understand text, the first challenge is to encode the text as […]

Applications of Metabolic Flux Analysis (MFA)

February 15, 2019

Metabolic Flux Analysis (MFA) is a technique used to quantify metabolic fluxes (the rate of turnover of molecules). MFA has at least two important applications: First, by studying metabolic flux, it is possible to adjust the amount of substrates/ingredients in the medium/protocol of cell culture [1] or mutate genes [2] to improve productivity. Second, by […]

Effect of the chromosomal position on the expression of recombinant protein in microbes

February 8, 2019

Large-scale recombinant protein production is one of the most significant achievements of modern biotechnology. These proteins have wide applications in molecular biology, therapeutics, and industry. Efficient recombinant protein production using genetically manipulated organisms has saved lives by providing pure therapeutic and prophylactic proteins in accessible amounts. Today, more than 75 recombinant proteins […]

Antibiotic use in livestock farming: How much progress have we made?

January 10, 2019

The threat of antibiotic resistance is well documented and frequently highlighted in the media, with constant mentions of “superbugs” and the potential end of effective antimicrobials. In the United States alone, over 2 million people become infected with antibiotic-resistant bacteria, and the death toll can reach 23,000 annually (1). In Europe, the estimates are that […]

Feature selection in machine learning

December 14, 2018

The concept of “garbage in–garbage out” applies when building a data-driven predictive model. The time for training a model increases exponentially with the number of features. Too many features also become a hurdle when attempting to select meaningful features. Moreover, the risk of overfitting increases as the number of features increases given a limited number of […]

What is knowledge graph completion?

December 7, 2018

Short answer: Knowledge graph completion is the act of inferring new edges, called facts, in a knowledge graph based on the already existing relational data. Understanding this will require explanation of two concepts, namely 1) what a knowledge graph is, and 2) what knowledge graph completion is. For clarity, use of the term “knowledge graph” here is […]

Is this…a joke to you?

November 30, 2018

From comedy shows to everyday conversation, humor plays an important part in human interaction, whether it’s used to expedite courtship or enhance social bonding. And yet, if asked why we find certain things funny, many of us will struggle to express the reason. For something so prevalent in our daily lives, it is certainly […]

Challenges in transfer learning in biological data

November 18, 2018

A well-defined transfer learning approach is desirable in the field of computational biology, as it allows us to learn better in a specific field by utilizing the knowledge in related fields. However, this approach is not trivial at all, and so far there are some notable challenges in applying transfer learning to biological data: Different biological features […]

Sensible Machine Learning

November 2, 2018

There are many choices and assumptions to make when designing a machine learning (ML) based system. Taking the common choice is appealing but can undermine your system's performance. Having recently designed an ML-based system for prediction of gene expression (GE) [1], we made various uncommon but sensible choices and assumptions given the particular problem […]

Heterogeneous partition of multidrug efflux pump and emergence of antibiotic resistance

October 26, 2018

The emergence of antibiotic-resistant bacteria is a major problem of the modern era. Bacteria are developing resistance against antibiotics at an alarming rate, and we are running out of effective antibiotics. Bacteria deploy several short- and long-term adaptive strategies to counteract the effect of potent antibiotics. Short-term strategies include pumping out drugs using multidrug efflux […]

Are we sending life to Mars?

October 4, 2018

The search for life outside planet Earth has been the topic of research of many scientists over the years. Human space exploration and the imminent possibility of actually colonizing other planets (Mars being our best candidate) raise concerns among a portion of these scientists, since we could, unintentionally, contaminate space with our Earth microbes. Multiple […]

Numerical stability of binary cross entropy loss and the log-sum-exp trick

September 26, 2018

When training a binary classifier, cross entropy (CE) loss is usually used, as squared error loss cannot distinguish bad predictions from extremely bad predictions. The CE loss is defined as follows: $$L_{CE}(y,t) = -t\log y - (1-t)\log(1-y)$$ where $y$ is the probability of the sample falling in the positive class $(t=1)$. $y = \sigma(z)$, where […]
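
As a rough illustration of the stability issue named in the title, here is a minimal sketch (not the post's own code) that takes the CE loss as the negative log-likelihood, $-t\log y - (1-t)\log(1-y)$ with $y = \sigma(z)$, and evaluates it directly from the logit $z$. The function names `softplus` and `bce_with_logits` are hypothetical, chosen only for this example. It relies on the identities $\log\sigma(z) = -\log(1+e^{-z})$ and $\log(1-\sigma(z)) = -\log(1+e^{z})$, and on computing $\log(1+e^{x})$ as $\max(x,0) + \log(1+e^{-|x|})$, a two-term case of the log-sum-exp trick.

```python
import math

def softplus(x):
    # log(1 + exp(x)) without overflow: factor out max(x, 0) so that
    # the remaining exponent -|x| is never positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def bce_with_logits(z, t):
    # Negative log-likelihood for logit z and target t in {0, 1}:
    #   -t*log(sigmoid(z)) - (1-t)*log(1 - sigmoid(z))
    # rewritten as t*softplus(-z) + (1-t)*softplus(z).
    return t * softplus(-z) + (1.0 - t) * softplus(z)
```

With a large logit and the wrong target, for example `bce_with_logits(1000.0, 0.0)`, this form stays finite, whereas first computing $y = \sigma(z)$ and then taking $\log(1-y)$ would hit $\log 0$.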

Sparsity in Machine Learning

September 10, 2018

One of the concepts that can improve the effectiveness of a machine learning (ML) method is the consideration of sparsity in its design. Here I give a short summary of the benefits of sparsity considerations in ML. Definition: A set of numbers (e.g. vector, matrix, etc.) is considered sparse when a high percentage of the values are […]

Lab-grown meat is on the way!

September 10, 2018

Lab-grown meat, more popularly known as clean meat, synthetic meat, or in vitro meat, is prepared in an experimental lab using animal tissue culture approaches. Lab-grown meat has recently gained a lot of attention because it can significantly reduce greenhouse gas emissions and save energy. A study published by the scientists of […]

From E. coli to Salmonella: Transfer learning application in E. coli data

September 7, 2018

Imagine your computer learning much more efficiently from related well-known knowledge fields and applying these concepts to lesser-known knowledge fields: from identifying dogs to identifying cats [1], from playing Age of Empires to playing StarCraft, or from reading articles written in English to reading articles written in Spanish. Transfer learning attempts to solve these kinds […]

Navigating definitions: Antimicrobials

July 28, 2018

Scientists use jargon and abbreviations all the time, and communication of science to a broad audience, which can also include scientists from diverse backgrounds and fields, can be jeopardized if such terms are not explained and defined in a simpler, accessible way. Even beginner scientists can struggle when reading scientific papers or navigating a new […]

Hierarchical Active Transfer Learning

July 28, 2018

In my last post, we explored a machine learning method known as transfer learning used to counter problems in which two models have similar purposes but data is more readily available for one than the other. By exploiting similarities in data, transfer learning allows us to leverage one model to increase accuracy in the other, […]

Targeted experiments by optimal experiment design

July 12, 2018

In the era of big data, selecting which experiments to conduct to fill the gaps in the data at hand is becoming a challenge. The selected experiments should be targeted such that they bring in the most information related to the problem of interest. By conducting targeted experiments, we will reduce the redundancy […]

Salmonella: A bacterium with incredible diversity and application

July 8, 2018

Salmonella is a bacterium that can cause typhoid fever and diarrhea in humans and animals. Each year, 16 million cases of typhoid fever are reported, causing 600,000 deaths.[1] It also causes more than 15% of infection cases in the U.S. each year.[2] People spend several billion dollars annually to try to prevent Salmonella […]

How to Evaluate a Binary Classifier?

July 5, 2018

Evaluating a binary classifier is a common task in machine learning. In this blog, we will talk about what metrics to use for evaluating a binary classifier, why multiple metrics are needed, how to plot a receiver operating characteristic (ROC) curve, how a random classifier works, and […]

Deep Learning Applied to Biology

June 16, 2018

I wanted to quickly take a moment to reflect on the work that our team has completed so far. We are in the midst of exciting state-of-the-art research at the intersection of deep learning and biological research. A few areas that we are currently involved in: • Optimal experimental design […]

Transfer learning – an Omics Prediction Solution?

June 15, 2018

Of late, training deep neural networks on tasks has become increasingly easy. Whether the task is image recognition, natural language processing, or gene expression prediction, if one has a large amount of data, one can train a model to map inputs to outputs with very high accuracy. The question subsequently becomes, how much data is […]

Research trends: Microbiome

June 8, 2018

Research in biomedical sciences has its trends, as in many areas of science. I’ve become interested in microbiome research recently and might stay on it for a number of years. As a computational biologist, I wanted to quantitatively examine the growth in microbiome research. So I used ISI Web of Science. I performed two independent searches from […]

Beating the bad bugs using antimicrobial peptides

June 8, 2018

The emergence of antibiotic-resistant bacteria at an alarming rate is one of the major ongoing catastrophes. Most antibiotics are failing to control these bacteria, and now the emergence of “superbugs” has created serious health and safety concerns. Researchers around the world have been trying to develop alternative methods to control microbial infections. Antimicrobial peptides (AMPs) have […]

Triclosan: from hero to villain

June 7, 2018

For the fans of antimicrobial soaps, the determination by the Food and Drug Administration (FDA) to remove triclosan from antimicrobial soaps by the end of last year was an unpleasant surprise [1]. The rule was motivated by extensive research in the field suggesting harmful effects from triclosan exposure in animals and the development of antimicrobial […]