Workflow Studio
Data Workflow Automation and
Natural Language Processing

High-performance, scalable, big-data workflow technologies that integrate language processing, analytics, data mining, and data cleaning into a single, seamless platform.

Workflow Studio Workflow Automation Server

Secure, High-Performance, Scalable, Workflow Automation,
Queue Management, and Natural Language Processing.

Human Language Technology Enhanced by Artificial Intelligence

Workflow Studio provides queue management and workflow automation for advanced workflow processing, machine translation and natural language processing tasks.

Workflow Studio is Omniscien’s newest product. Many of the tools provided in Workflow Studio have been used and evolved by the Omniscien team for nearly 14 years. In recent years, some of these tools were made available for customer use, such as Language Studio Drop Folders, which was accompanied by LSScript and LSTools. These tools have all been updated, are being released as feature-enhanced versions, and have been combined with several other modules that collectively make up the Workflow Studio 1.0 product.

Earlier versions of Language Studio Drop Folders, LSTools, and LSScript have been deployed successfully in numerous organizations around the world that needed workflow automation and natural language processing. Usually, but not always, these tools were deployed in conjunction with machine translation as pre- and post-processing workflows. Some workflows were developed for customers by the Omniscien team, and many customers have built their own very impressive workflows over time. Examples of some of the workflows that have been built are provided below.

At the same time, the Omniscien team has been adding features and enhancing the tools for their own internal use when gathering, processing, cleaning, and preparing data for building custom machine translation engines, and training machine learning and artificial intelligence-based models. With the release of Workflow Studio, these powerful tools are now available for anybody to use for high-volume data workflow automation.

What can you do with Workflow Studio?

Workflow Studio, as its name suggests, is designed to simplify complex workflows. To do this, it provides many powerful processing tools that are connected by LSScript, Omniscien’s high-performance JavaScript runtime engine. JavaScript is a very common skill set because most websites depend heavily on JavaScript in conjunction with HTML. If your team has a web developer, then your team already has a Workflow Studio developer who can build complex language processing workflows quickly and easily.

The list below is a small sample of some of the things that our customers have been able to achieve with Workflow Studio:

  • Automatically detect the language of a document and submit it to an appropriate machine translation engine so as to convert all documents to English.
  • Analyze thousands of files to automatically categorize them by domain topic (e.g., medical, technology, news).
  • Analyze documents and extract data out into a normalized form.
  • Connect to a web server application to automatically determine the size of a job to be processed by a Language Service Provider and give an instant quote. Automatically determine the language, document format, number of pages, words, and characters.
  • Build custom pre- and post-processing for your machine translation workflows.
  • Mine data from remote websites, extract text, and analyze the content for valuable information.
  • Determine if customer feedback has a positive or negative sentiment.
  • Automatically determine the encoding of documents and convert them to UTF-8.
  • Extract people and organization names from large bodies of text.
  • Transcribe videos and convert them to text files.
  • Connect and process data across multiple remote systems. Manage the workflow between each system, tracking each task at a job level.

The above list is just the beginning. What will you automate with Workflow Studio? To discuss your ideas, contact an Omniscien team member for a demo and more detail.

Workflow Studio is available as a server platform that scales from one to hundreds of servers to meet the performance requirements of the most demanding environments.

Register now to participate in the
Workflow Studio 1.0 Beta Program

We are accepting a limited number of beta users during the beta period. Sign up now to get early access before the full release.

Core Functionality

Workflow Studio is designed to scale to process large volumes of data. It is optimized for environments like AWS where capacity can be added and removed on demand. The features are so powerful that all of Omniscien’s own data processing and workflow is managed with the exact same tools.

The following five core functional modules underpin the platform and make workflow automation and language processing accessible to all.

[Figure: Data Automation Diagram]

Workflow Automation Server

At the heart of the Workflow Studio platform is the Workflow Automation Server, which executes all the workflows and contains all the Natural Language Processing (NLP), text and file manipulation, and workflow scripting engine features. A range of other technologies is also embedded in the Workflow Studio platform, such as web crawling, document format conversion, and data extraction.

When there is more than one Workflow Automation Server instance, each can be configured for a different role, capacity, and purpose. A Workflow Automation Server can be configured to process specific kinds of jobs or a range of jobs. Each instance can set the number of concurrent jobs it runs for each job type.

Dynamic Job Distribution

Workflow Studio operates as a standalone system or as a distributed job management platform designed to scale and spread large job workloads across multiple servers.

After assigning one instance of Workflow Studio Workflow Automation Server to operate in the Dynamic Job Distribution role, all job requests are centralized and managed via an easy-to-use web-based portal. With an instance in this load-balancing role, new instances of Workflow Automation Server auto-register their presence and can be added to and removed from the pool dynamically without any manual effort. As load increases or decreases, servers can be added or removed.

Dynamic scaling based on load is ideal for cloud environments such as AWS, Microsoft Azure, or Google Cloud that can quickly instantiate new instances to handle sudden bursts of traffic. Each server instance can be configured to process one or more job types and for appropriate job volumes based on server configuration and capacity.
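
The distribution behavior described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the function names (`registerServer`, `removeServer`, `assignJob`), the data shapes, and the "most free slots wins" rule are hypothetical and are not taken from the Workflow Studio API.

```javascript
// Hypothetical sketch: names and data shapes are assumptions,
// not the Workflow Studio API.

function createPool() {
  return { servers: new Map() };
}

// A server "auto-registers" by announcing a concurrency limit
// for each job type it is configured to accept.
function registerServer(pool, name, limits) {
  const running = {};
  for (const type of Object.keys(limits)) running[type] = 0;
  pool.servers.set(name, { name, limits, running });
}

// Servers can leave the pool at any time as load decreases.
function removeServer(pool, name) {
  pool.servers.delete(name);
}

// Pick the eligible server with the most free slots for this job type.
// Returns the server name, or null if the job has to wait in the queue.
function assignJob(pool, jobType) {
  let best = null;
  let bestFree = 0;
  for (const server of pool.servers.values()) {
    const limit = server.limits[jobType] || 0;
    const free = limit - (server.running[jobType] || 0);
    if (free > bestFree) {
      best = server;
      bestFree = free;
    }
  }
  if (best === null) return null;
  best.running[jobType] += 1;
  return best.name;
}

const pool = createPool();
registerServer(pool, "server-a", { ocr: 2, translate: 4 });
registerServer(pool, "server-b", { translate: 1 });
```

With this pool, three consecutive calls to `assignJob(pool, "ocr")` would return `"server-a"` twice and then `null`, because only one server accepts OCR jobs and it allows two at a time.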

Folder and Data Source Monitoring

Jobs can be started or registered by a variety of different means. A simple REST API call from a custom application adds a job to the job queue ready for processing and tracking to completion. However, often there are data sources where the presence of a file or a record can be detected to automatically trigger the job registration process.

Folders can be monitored across a wide variety of file storage systems. Folders can be on a local disk, on a network disk, or hosted remotely on AWS S3, FTP, SFTP, Google Drive, OneDrive Business, Dropbox, or Box.com. Mail server folders can also be monitored. Files are automatically detected, copied to the server, and processed, and the output is returned. Output can take the form of a new file being delivered, a REST API call to notify an external system that a job is complete, or an email containing the processed output.
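
One common way to detect new files in a monitored folder is to compare successive listings. The sketch below is an assumption about how such detection could work, not a description of Workflow Studio internals: the function name `findReadyFiles`, the listing shape, and the rule that a file must be unchanged across two polls (so that half-uploaded files are skipped) are all hypothetical.

```javascript
// Hypothetical sketch: decides which files in a monitored folder are ready
// to be registered as jobs. A listing maps file name -> { size, modified },
// which is information most storage backends can report.
function findReadyFiles(previous, current, registered) {
  const ready = [];
  for (const [name, info] of Object.entries(current)) {
    if (registered.has(name)) continue; // already submitted as a job
    const before = previous[name];
    // Assumption: a file is ready once it looks identical on two
    // consecutive polls, meaning the upload has stopped changing it.
    if (before && before.size === info.size && before.modified === info.modified) {
      ready.push(name);
    }
  }
  return ready.sort();
}

const previousPoll = {
  "a.docx": { size: 100, modified: 1 },
  "b.pdf": { size: 50, modified: 2 },
};
const currentPoll = {
  "a.docx": { size: 100, modified: 1 }, // unchanged, so ready
  "b.pdf": { size: 80, modified: 3 },   // still growing
  "c.txt": { size: 10, modified: 3 },   // first time seen
};
```

Here `findReadyFiles(previousPoll, currentPoll, new Set())` would return `["a.docx"]`. The other two files would be picked up on a later poll once they stop changing.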

Natural Language Processing and Data Processing

Embedded in Workflow Studio is a range of Natural Language Processing (NLP) tools. These tools analyze and automatically process text for a wide variety of tasks. The list below is a partial list of the types of NLP-related tools that are embedded in Workflow Studio and accessible through LSScript and LSTools:

  • Language Identification: Identify the language of sentences and files.
  • Sentiment Analysis: Determine if the sentiment of the text is positive, negative, or neutral.
  • Syntax Parsing: Grammatically analyze the structure of sentences.
  • Part of Speech: Determine the part of speech of each word in a sentence.
  • Named Entity Recognition: Extract people names, organization names, locations, dates, currencies, and other entities.
  • Term Extraction and Generation: Extract predefined and relevant terms from a given text.
  • Domain Identification: Identify the sphere and domain of a given text.
  • Document Alignment: Match bilingual documents in different languages as document pairs.
  • Sentence Alignment: Match bilingual sentences from pairs of documents.
  • Word Stemming: Reduce words back to their stem form.
  • Optical Character Recognition (OCR): Analyze images and PDF files and convert them into text or Microsoft Office Documents.
  • Automated Speech Recognition (ASR): Recognize speech and transcribe it as text.
  • Document Conversion: Convert documents between a variety of different formats.
  • Web Crawling: Download entire websites.
  • Data Mining: Analyze large bodies of content and extract useful data.
  • Machine Translation: Translate text across 600+ language pairs.

And many more...

Workflow Scripting

Workflow Automation Server embeds LSScript, a high-performance ECMAScript-compliant JavaScript runtime engine that is ideal for microservices and workflows. It bridges the tools and processes deployed within Workflow Studio, which are written in a variety of programming languages, so that they no longer operate in isolation. Tools are exposed to the LSScript runtime via the LSTools Helper Object. LSTools provides easy access to some of the most advanced NLP functions, sub-applications such as web crawling, and a comprehensive set of file, data, and text manipulation features.

Because LSScript uses JavaScript, any developer who knows how to build a web page already has the basic skills to start building workflows. Even the most complex NLP functions can be accessed with just a few lines of code. Complex language processing has never been easier or more accessible.
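
To give a feel for what a few lines of workflow script might look like, here is a sketch of the "detect the language, then translate to English" workflow mentioned earlier. It is hedged heavily: this page does not document the LSTools method names, so the `tools` parameter and its `detectLanguage` and `translate` methods are invented placeholders, and the stub that makes the example runnable is a toy, not real language identification or machine translation.

```javascript
// Hypothetical sketch. `tools` stands in for whatever helper object the
// runtime provides; the method names are invented for illustration and
// are NOT the documented LSTools API.
function translateToEnglish(text, tools) {
  const language = tools.detectLanguage(text);
  if (language === "en") return text; // already English, nothing to do
  return tools.translate(text, language, "en");
}

// Toy stub used only so the sketch can run on its own. It guesses "fr"
// when it sees accented characters and tags the text instead of
// translating it.
const stubTools = {
  detectLanguage: (text) => (/[àâçéèêëîïôûùüÿœ]/i.test(text) ? "fr" : "en"),
  translate: (text, from, to) => `[${from}->${to}] ${text}`,
};
```

Passing the helper object in as a parameter keeps the workflow logic separate from the tools it calls, which also makes the routing easy to test with a stub like the one above.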

Feature Overview

Each feature is built on a core of Artificial Intelligence, Machine Learning and Natural Language Processing

Machine learning enables machines to work more like humans so that humans don't have to work more like machines. Each feature is designed to augment human intelligence, enhance productivity, increase quality, and reduce cost. Artificial intelligence enables processing and organization of data that would simply not be cost-effective or feasible with a human-only approach.

Workflow Studio incorporates hundreds of features and functions for file, job, data, and language processing. The feature list below is just the tip of the iceberg. Talk to an Omniscien team member for a demo to better understand the features and capabilities.

CIS Hardened Docker Containers

  • On-premises or private cloud software is shipped as Docker containers.
  • All Docker containers are hardened to Center for Internet Security (CIS) standards.
  • Docker containers are updated frequently with the latest security patches and software updates.

LSTools Toolkit

  • A comprehensive toolkit for NLP, file, data, and text manipulation.
  • Integrated into LSScript for easy workflow utilization.

LSScript Workflow Scripting

  • High-performance ECMAScript-compliant JavaScript that has full access to all the tools and features exposed by LSTools.

Language Studio Integration

  • Language Studio connectors provide seamless processing for document translation and workflow before and after translation.

Media Studio Integration

  • Subtitle-optimized machine translation and other media-related workflows are preconfigured out of the box.

Easy Integration

  • Workflow Studio can monitor a wide range of sources to register jobs for processing.
  • Job sources include:
    Email, FTP and SFTP, Local Disk, REST API (Call Out), REST API (Call In), AWS S3, Google Drive, Dropbox, Box.com, OneDrive Business, and more.

Job Queue Management

  • Job queues are managed via an easy-to-use REST API or a secure web-based portal.
  • Add, pause, delete, cancel, rerun, adjust priorities, and check the status of jobs.
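
The queue operations listed above can be modeled in a few lines. The class below is a minimal in-memory sketch under stated assumptions: the class name `JobQueue`, its method names, the status strings, and the "highest priority first, oldest first among equals" rule are all hypothetical, and it does not represent the Workflow Studio REST API.

```javascript
// Hypothetical in-memory model of a job queue; not the Workflow Studio API.
class JobQueue {
  constructor() {
    this.jobs = new Map(); // job id -> { id, priority, status, order }
    this.counter = 0;      // preserves arrival order among equal priorities
  }

  add(id, priority = 0) {
    this.jobs.set(id, { id, priority, status: "queued", order: this.counter++ });
  }

  // Move a job to a new state only if it is currently in the expected state.
  transition(id, from, to) {
    const job = this.jobs.get(id);
    if (!job || job.status !== from) return false;
    job.status = to;
    return true;
  }

  pause(id) { return this.transition(id, "queued", "paused"); }
  resume(id) { return this.transition(id, "paused", "queued"); }
  cancel(id) {
    return this.transition(id, "queued", "cancelled") ||
           this.transition(id, "paused", "cancelled");
  }

  setPriority(id, priority) {
    const job = this.jobs.get(id);
    if (job) job.priority = priority;
  }

  status(id) {
    const job = this.jobs.get(id);
    return job ? job.status : "unknown";
  }

  // Start the next job: highest priority first, oldest first among equals.
  next() {
    const queued = [...this.jobs.values()].filter((j) => j.status === "queued");
    if (queued.length === 0) return null;
    queued.sort((a, b) => b.priority - a.priority || a.order - b.order);
    queued[0].status = "running";
    return queued[0].id;
  }
}
```

Paused and cancelled jobs are simply skipped by `next()`, so pausing a high-priority job lets lower-priority work proceed until the job is resumed.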

Job Completion Callback

  • As a job progresses and when it completes (successfully or not), its status can be sent to a callback REST API in an external application.

Multi-Layered Job Dependencies

  • Jobs can be linked so that they run only after the earlier jobs they depend on have completed successfully.
  • Workflows can be built by passing the outputs of each job as inputs to subsequent jobs.
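
Ordering jobs so that each one runs only after its prerequisites is a classic topological sort. The sketch below uses Kahn's algorithm. The function name `executionOrder`, the job names, and the `dependsOn` field are illustrative assumptions, and this is one possible way to resolve dependencies rather than a description of how Workflow Studio does it.

```javascript
// Hypothetical sketch: job ids and the `dependsOn` field are assumptions.
// Returns job ids in an order where every job follows its prerequisites.
function executionOrder(jobs) {
  const remaining = new Map();  // job id -> number of unfinished prerequisites
  const dependents = new Map(); // job id -> ids of jobs waiting on it
  for (const job of jobs) {
    remaining.set(job.id, job.dependsOn.length);
    for (const dep of job.dependsOn) {
      if (!dependents.has(dep)) dependents.set(dep, []);
      dependents.get(dep).push(job.id);
    }
  }

  // Start with the jobs that have no prerequisites at all.
  const ready = jobs.filter((j) => j.dependsOn.length === 0).map((j) => j.id);
  const order = [];
  while (ready.length > 0) {
    const id = ready.shift();
    order.push(id);
    // Finishing a job may unblock the jobs that were waiting on it.
    for (const next of dependents.get(id) || []) {
      remaining.set(next, remaining.get(next) - 1);
      if (remaining.get(next) === 0) ready.push(next);
    }
  }

  // Any job left over is part of a cycle or depends on a job that is missing.
  if (order.length !== jobs.length) {
    throw new Error("Circular or missing job dependency");
  }
  return order;
}

const jobs = [
  { id: "translate", dependsOn: ["ocr"] },
  { id: "ocr", dependsOn: [] },
  { id: "deliver", dependsOn: ["translate"] },
];
```

For the three jobs above, `executionOrder(jobs)` would return `["ocr", "translate", "deliver"]` regardless of the order in which the jobs were declared.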

Job Status Monitoring

  • Track job status in real-time.
  • Standardized feedback processes and approaches for workflow status tracking.

Automated Drop Folder Processing

  • Process thousands of files by simply copying them into a pre-designated folder.
  • Files are automatically submitted and returned to another folder on completion.

Automated Housekeeping and Cleanup

  • Post-job cleanup and general housekeeping can be performed to keep systems healthy and free of stale files and data.