Introducing BioWF - The Bioinformatics Workflow Designer
- Posted: Aug 05, 2013 at 6:00AM
Today's project is a little outside what we normally highlight here, but nonetheless something I thought was interesting and kind of cool. And sometimes, different is good, right?
BioWF is a project by Mark Smith and not only shows how you can host Windows Workflow Designer in your app, but how you can use it to create new activities too.
One of the projects that was really interesting was Trident - https://tridentworkflow.codeplex.com/. It provided a graphical designer based on Windows Workflow 3.0 to create scientific analysis applications. The .NET Bio team created some activities to introduce bioinformatics into that platform, and it was a sample application that was shown off in some of the training sessions. Unfortunately, WF 3.0 was deprecated when .NET 4.0 shipped (and replaced by a completely different version of Workflow!), and the TSCB project just went dark. It was also quite heavy and slow, requiring SQL Server and several services just to execute the workflows.
But I really liked the idea of creating simple analysis programs with WF so I took the concept and created a new project – BioWF (http://markjulmar.github.io/BioWF/) which uses .NET 4.5 and .NET Bio 1.1 to provide a similar capability. It has two parts to it:
- A GUI designer which re-hosts Workflow 4.5 and provides access to a set of pre-defined activities and the core WF activities. You can create, edit and save workflows to XML-based files.
- A console-based runner which can take a persisted WF and execute it, providing both input and output capabilities.
You can then drag various activities from the toolbox on the left. Each activity can be selected and have properties changed in the property explorer on the bottom right of the screen. As an example, let’s create a sequence and save it to a FASTA file:
1. Drag the CreateSequence activity onto the design surface (right where it says “Drag activity here”). It should show some validation errors.
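For readers unfamiliar with the target format, a FASTA file is plain text: a header line beginning with `>` followed by the sequence, usually wrapped to a fixed width. The sketch below is a language-neutral illustration in Python, not BioWF or .NET Bio code; the function names, the identifier and the sample sequence are all hypothetical.

```python
def format_fasta(identifier, sequence, width=60):
    """Return one FASTA record as text.

    The record is a '>' header line followed by the sequence,
    split into lines of at most `width` characters.
    """
    # Slice the sequence into fixed-width chunks.
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + identifier] + body) + "\n"


def write_fasta(path, identifier, sequence, width=60):
    """Write a single FASTA record to `path` (hypothetical helper)."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(format_fasta(identifier, sequence, width))


if __name__ == "__main__":
    # A made-up DNA sequence, wrapped at 4 characters to show the layout.
    print(format_fasta("example_1", "ACGTACGTAC", width=4), end="")
```

In BioWF the equivalent work is done by activities on the design surface; this only shows the shape of the file the workflow is meant to produce.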
This is very much an alpha right now – there are some little bugs here and there and it needs to have more comprehensive activities added, but it’s a good start. If anyone is interested in helping out, adding features or just using this then please drop me a line!
Feel free to download the source and build it – you will need Visual Studio 2012 (any edition) and Windows 7 or better (where .NET 4.5 is supported).
The source downloaded and ran for me after a quick NuGet reference fix-up.
There's just something I find cool about using WF to create DNA sequences...
What is this .NET Bio thing?
.NET Bio is an open source library of common bioinformatics functions, intended to simplify the creation of life science applications.
The core library implements a range of file parsers and formatters for common file types, connectors to commonly-used web services such as NCBI BLAST, and standard algorithms for the comparison and assembly of DNA, RNA and protein sequences. Sample tools and code snippets are also included.
.NET Bio has been built with specific goals in mind:
Extensibility: .NET Bio is designed to be easy for a programmer to extend with new functions; please refer to the developer documentation available on this site. Developers who extend .NET Bio are encouraged to contribute their code back to the project so that the community as a whole can benefit from their work.
Flexibility: Whatever .NET-supported language you choose, the code you write will work with .NET Bio, so the accessibility of Visual Basic®, the power of C#, the speed and conciseness of functional languages such as F#, or the ad-hoc scripting capabilities of Python are all available, as are many others. As a library of common code, .NET Bio can be used to build whatever application type meets your needs, whether integrating with applications such as Microsoft Excel, building command-line or GUI applications from scratch, or creating cloud services or workflow components.
Community: .NET Bio is a community-owned open source project and welcomes participation and contributions from programmers with an interest in the life sciences. We provide forums for discussions and help, documentation and sample applications, and tools to report bugs and request new features.