Spring Batch + Karp Rabin = How Crispr Cas 9 Works

Is there a mathematical model or informatic one to describe how Crispr Cas9 works? I don't know.

I share with you my theoretical Model on How Crispr Cas 9 Works.

Spring Batch + Karp Rabin = How Crispr Cas 9 Works

Vulgarization Spring Batch + Karp Rabin = How Crispr Cas9 works

                            Author : Wadï Mami 

                        Publication Date : 07/06/2021

                     Email : wmami@steg.com.tn

 Wadï Mami Theory on explaining how Could Spring Batch + karp rabin explain how Crispr Cas9 Works.

CRISPR-Cas9 

CRISPR-Cas9 was adapted from a naturally occurring genome editing system in bacteria. The bacteria capture snippets of DNA from invading viruses and use them to create DNA segments known as CRISPR arrays.The CRISPR arrays allow the bacteria to "remember" the viruses (or closely related ones). If the viruses attack again, the bacteria produce RNA segments from the CRISPR arrays to target the viruses' DNA. The bacteria then use Cas9 or a similar enzyme to cut the DNA apart, which disables the virus.


Spring Batch reads and process DNA sequentially until reaching commit-interval value then it writes transformed items (DNA) simultaneously.

Spring Batch uses a 'Chunk Oriented' processing style within its most common implementation. Chunk oriented processing refers to reading the data one at a time, and creating 'chunks' that will be written out, within a transaction boundary. One item is read in from an ItemReader, handed to an ItemProcessor, and aggregated. Once the number of items read equals the commit interval


, the entire chunk is written out via the ItemWriter, and then the transaction is committed.

Below is a code representation of the same concepts shown above:

List items = new Arraylist();
for(int i = 0; i < commitInterval; i++){
    Object item = itemReader.read()
    Object processedItem = itemProcessor.process(item);
    items.add(processedItem);
}
itemWriter.write(items);

Spring Batch is the bacteria.

The bacteria capture snippets of DNA from invading viruses and use them to create DNA segments known as CRISPR

 arrays

 example private String dna_pattern = "AATTCC"; //snippets of DNA from invading viruses in

 https://github.com/didipostman/CrisprCas9/blob/main/src/main/java/com/juxtapose/example/ch02/DNA_SequenceProcessor.java

<=> Spring Batch read DNA file or DNA database , The DNA file or the DNA database are Viruses DNA.

 SpringBatch read() --->ItemReader and ItemReader return item. <=> Spring Batch process() ----> ItemProcessor and

 return transformed item = DNA segments known as CRISPR arrays Here I use DNA_sequenceProcessor class that

 implements ItemProcessor and uses Karp Rabin (you can use other DNA pattern recognition algorithm)

 https://github.com/didipostman/CrisprCas9/blob/main/src/main/java/com/juxtapose/example/ch02/DNA_SequenceProcessor.java 

The CRISPR arrays allow the bacteria to "remember" the viruses (or closely related ones). If the viruses attack again,

 the bacteria produce RNA segments from the CRISPR arrays to target the viruses' DNA example private String

 dna_pattern = "AATTCC"; in

 https://github.com/didipostman/CrisprCas9/blob/main/src/main/java/com/juxtapose/example/ch02/DNA_SequenceProcessor.java 

The bacteria then use Cas9 or a similar enzyme to cut the DNA apart, which disables the virus.

 <=> Spring batch write(transformed items) ----> ItemWriter ( cut Virus DNA ). 

Conclusion

 Spring Batch + Karp Rabin = how CRISPR Cas9 works is my IT theoretical model may be it could be interesting and useful for drugs discovery. The model is an idea that had been haunting me since 2012. I share it with you. I can’t go further with it, may be you find it useful interesting and continue developement. The model is under MIT License

 https://github.com/didipostman/CrisprCas9


If the Theory model is wrong or unseful or uninteresting read below there is always something to win from this idea

Abstract :

Processing large volume of data has always been a major problem due to the increasing volume of the data. Batch processing can be applied in many use cases. Among them why not Pattern Matching for DNA Sequencing Data. In this article, I am going to demonstrate batch processing using one of the projects of Spring which is Spring Batch. Spring Batch provides functions for processing large volumes of data in batch jobs. In our case reading DNA file or database table and seeking for patterns I mean all the locations of the specified pattern inside a DNA sequence.

Spring batch to process huge data : Spring Batch is a lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems.

A step is an object that encapsulates a sequential phase of a job and holds all the necessary information to define and control processing. It delegates all the

    <step id="dnaSeqStep">

        <tasklet transaction-manager="transactionManager">

            <chunk reader="csvItemReader" writer="csvItemWriter"

                processor="DNA_SequenceProcessor" commit-interval="2">

            </chunk>

        </tasklet>

    </step>

Configuring ItemReader We will now define ItemReader for our model which will be used for

reading data from CSV file.

                    <bean:property name="resource"

                        value="classpath:ch02/data/DNA.csv"/>

                    <bean:property name="lineMapper">

                        <bean:bean

                            class="org.springframework.batch.item.file.mapping.DefaultLineMapper">

                            <bean:property name="lineTokenizer" ref="lineTokenizer"/>

                            <bean:property name="fieldSetMapper">

                                <bean:bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">

                                    <bean:property name="prototypeBeanName" value="DNA_Sequence">

                                    </bean:property>

                                </bean:bean>

                            </bean:property>

                        </bean:bean>

                    </bean:property>

                </bean:bean>

                <!-- lineTokenizer -->

                <bean:bean id="lineTokenizer"

                    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">

                    <bean:property name="delimiter" value=","/>

                    <bean:property name="names">

                        <bean:list>

                            <bean:value>dna</bean:value>

                                           <bean:value>crissprArrays</bean:value>



                        </bean:list>

                    </bean:property>

                </bean:bean>

As you can see I use a DNASequence_Processor class that implements itemProcessor and use Karp Rabin Algorithm.

ItemWriter

                    <bean:property name="resource" value="file:target/ch02/outputFile.csv"/>

                    <bean:property name="lineAggregator">

                        <bean:bean

                            class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">

                            <bean:property name="delimiter" value="|"></bean:property>

                            <bean:property name="fieldExtractor">

                                <bean:bean

                                    class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">

                                    <bean:property name="names"

                                         value="dna, seqDNA_Arrays">

                                    </bean:property>

                                </bean:bean>

                            </bean:property>

                        </bean:bean>

                    </bean:property>

                </bean:bean>

Conclusion

This article just scratched the surface of Spring Batch in general. The example used in this article is not production-ready code. You can define job configuration depending on your project requirements.

Here The Github repository for the project

github.com/didipostman/SBKarpRabin

DNA is a sequence of letters such as A, C, G, T. Searching for specific sequences is often difficult due to measurement errors, mutations or evolutionary alterations. Thus, similarity of two sequences using Levenshtein Distance is more useful than exact matches.

So instead of Karp Rabin we will use Levenshtein Distance or Jaro_Winkler_Similarity by using

Package org.apache.commons.text.similarity commons.apache.org/proper/commons-text/apid..

Nano Robots could be guided by

+ Karp Rabin (or Levenshtein Distance or JaroWinkler Similarity) it is just a suggestion I am not an expert a #nanotechnology #java #developer then you can cure diseses like #cancer #diabetes etc


The Ultimate definitive guide on how Spring batch could may explain how CRISPR Cas9Works read this link  https://didipostmanprojects.blogspot.com/2023/01/the-definitive-explanation-on-how.html

-- Minds, like parachutes, function best when open. ,,,

                      (o o)

 / --------oOO--(_)--OOo--------------------\

 | Wadï Mami didipostman

 | Github : https://www.github.com/didipostman 

| e-mail : wmami@steg.com.tn / didipostman77@gmail.com

 | Twitter : @MamiWad3

 | \----------------------------------------/ 

                         | |

 | \---------------/ \------------------------/








Comments

  1. The Ultimate definitive guide on how Spring batch could may explain how CRISPR Cas9Works read this link https://didipostmanprojects.blogspot.com/2023/01/the-definitive-explanation-on-how.html

    ReplyDelete

Post a Comment

Popular posts from this blog

Generative AI explaining my model

Last Gemini Answer Spring Batch as a model for CRISPR Cas9