  • BotCity Maestro Updates

    👋 Hello!

    We are excited to announce a new set of developments that our team has been working on over the last two weeks.

    Our cloud orchestrator BotCity Maestro received major user experience improvements and new features!

    Over the next sections we will cover what changed and guide you through the new features! We hope you enjoy it!

    Simplified Sign Up

    Do you have an e-mail and a password in mind? Great! That’s all you need!

    New sign up form

    New Layouts

    From the new login and password recovery screens all the way to a major product redesign, we refreshed the BotCity Maestro design and worked to make it more functional and pleasant to work with.

    Login screen

    Your door into BotCity Maestro now not only looks better but is also more functional and brings you easy access to our community channels.

    New login screen

    The password recovery screen also received a visual update and bug fixes.

    Password recovery screen

    New Home Screen

    The redesigned home screen now features four cards allowing you quick access to download the BotCity Studio SDK, our Documentation Portal, BotCity Academy and the BotCity Maestro.

    BotCity Maestro Home screen

    Log Management

    Until now, log management was only available via the BotCity Command Line Interface (CLI) or the SDK. Viewing and downloading logs were the only options available in the BotCity Maestro portal.

    Today we are glad to announce that you can not only view and download logs but also create and delete them directly from the BotCity Maestro interface.

    Clicking the New Log button takes you to the Log creation screen, where you define the columns and the Log label. The label is the unique ID used when interacting with this Log via the BotCity Maestro SDK or API.

    Log creation screen

    Another major change is that logs are no longer limited or tied to a single Automation. You can now create as many logs as you need to better organize your data and automation management.
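
    To make the label-and-columns idea concrete, here is a minimal, self-contained sketch of a log definition and a check that an entry matches its columns. The LogDefinition class, the label invoice_log and the column names are all hypothetical; this is not the BotCity Maestro SDK, only an illustration of the concept.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogDefinition:
    """Hypothetical stand-in for a log created on the Log creation screen."""
    label: str                                  # unique ID used to address the log
    columns: List[str] = field(default_factory=list)

    def validate_entry(self, entry: Dict[str, object]) -> bool:
        """Return True when the entry provides exactly the defined columns."""
        return set(entry) == set(self.columns)


# Assumed example values, not taken from a real BotCity workspace.
invoice_log = LogDefinition(label="invoice_log", columns=["invoice_id", "status"])

ok = invoice_log.validate_entry({"invoice_id": "A-1", "status": "processed"})
missing = invoice_log.validate_entry({"invoice_id": "A-2"})
print(ok, missing)
```

    Because a log is identified by its label rather than by an Automation, several automations could write to the same definition in a model like this one.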

    Try it out now!

    Runner Management

    We recently updated our terminology with the intent to simplify the understanding of what Machines are. Recent updates on the BotCity Maestro replaced all occurrences of the term Machine with the more appropriate term Runner.

    As a reminder, a Runner is a service created by BotCity which runs on a host (computer, virtual machine, cloud-based host, container or some other form). A host can run one or more BotCity Runners in parallel, depending on the use case and hardware capabilities.

    In the past we had to create Runners via the BotCity CLI using a command like the one described below:

    BotCLI machine new -machineId vm01

    Now you can create Runner instances directly from the BotCity Maestro interface by clicking on the New Runner button.

    New runner screen

    The Runner card contains the other operations that you can perform:

    • View instance information
    • View Runner instance log file
    • Capture Screenshot
    • Edit the instance
    • Remove the instance

    Runner card with latest screenshot and action menus

    Bot Management

    Another great change that we are proud to present is the new feature to manage Bots via the BotCity Maestro interface.

    Before this feature was added, the only way to handle the lifecycle of a Bot (Deploy, Release and Update) was via our command line interface, and deleting bots was not possible.

    Starting now, you can perform all of those operations directly from the BotCity Maestro user interface with a couple of clicks.

    From the new My Bots menu you can now deploy new bots and versions, mark versions of bots as released, update the bot code and delete bots.

    Here is how to do it.

    Deploy

    Deploy is the process used to add new bots or new versions of existing bots on the BotCity Maestro platform.

    You can deploy a bot by simply clicking on the Deploy button and filling in the form with the Bot ID, version, file and technology used. The form mirrors the syntax of the command line, which was:

    BotCLI bot deploy -botId demoWeb -version 1.0 -file demoWeb-1.0.tar.gz -python

    Deploying a new Bot using the BotCity Maestro interface

    Update

    Update is the process used to overwrite or replace the code or executable associated with a given bot version. Please note that this process is irreversible and will overwrite an existing version even if it is marked as released, so please be careful.

    You can update a bot by simply clicking on the pencil icon on the right side of the line associated with your desired bot version and selecting a new file to replace the current one. That is much simpler than the command line which was:

    BotCLI bot update -botId demoWeb -version 1.0 -file demoWeb-1.0.tar.gz -python

    Updating an existing Bot using the BotCity Maestro interface

    Release

    Release is the process used to mark which of the bot’s available versions is used when executing tasks. This version will be downloaded by the BotCity Runner and executed on your host.

    You can define the released version by simply clicking on the ribbon icon (highlighted by the red arrow below).

    Marking a version as the Release version

    The equivalent procedure via the command line interface was:

    BotCLI bot release -botId demoWeb -version 2.0

    Delete

    A new feature! To remove a deployed version, simply click on the trash can icon (highlighted by the red arrow) and confirm in the dialog that appears.

    Deleting a deployed bot version
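
    The four operations above can be summarized in a small in-memory model. This is only a sketch of the lifecycle as described in this post: BotRegistry and its methods are hypothetical names, and the error handling (for example, refusing to deploy a version twice) is an assumption rather than documented BotCity Maestro behavior.

```python
class BotRegistry:
    """Hypothetical in-memory model of the deploy/update/release/delete lifecycle."""

    def __init__(self):
        self.versions = {}   # (bot_id, version) -> file name
        self.released = {}   # bot_id -> released version

    def deploy(self, bot_id, version, file):
        # Deploy adds a new bot or a new version of an existing bot.
        # Assumption: deploying an existing version is rejected.
        if (bot_id, version) in self.versions:
            raise ValueError("version already deployed; use update() instead")
        self.versions[(bot_id, version)] = file

    def update(self, bot_id, version, file):
        # Update overwrites the file of an existing version, released or not.
        if (bot_id, version) not in self.versions:
            raise KeyError("cannot update a version that was never deployed")
        self.versions[(bot_id, version)] = file

    def release(self, bot_id, version):
        # Release marks the version that Runners download and execute.
        if (bot_id, version) not in self.versions:
            raise KeyError("cannot release a version that was never deployed")
        self.released[bot_id] = version

    def delete(self, bot_id, version):
        # Delete removes a deployed version.
        # Assumption: deleting the released version clears the release mark.
        del self.versions[(bot_id, version)]
        if self.released.get(bot_id) == version:
            del self.released[bot_id]


registry = BotRegistry()
registry.deploy("demoWeb", "1.0", "demoWeb-1.0.tar.gz")
registry.deploy("demoWeb", "2.0", "demoWeb-2.0.tar.gz")
registry.release("demoWeb", "2.0")
registry.update("demoWeb", "1.0", "demoWeb-1.0-fixed.tar.gz")
registry.delete("demoWeb", "1.0")
print(registry.released, sorted(registry.versions))
```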

    Create your Account

    Don’t have a BotCity Account? Sign up for Free

    More on the Way

    We hope you are enjoying the new features as much as we are while developing and using them.

    As always, if you have questions or find any sort of issues or would like to suggest a feature, feel free to reach out to us via one of our community channels.

    I won’t give you spoilers but there are many cool features coming up over the next weeks! 🚀

    Keep an eye out and make sure to follow us on the social media channels so you can be notified when we release them!

    Have fun automating! 🦾🤖

  • Python RPA + Orchestration – Here it comes.

    There is a growing trend for developing RPA solutions as ordinary software using traditional programming languages like Python. As technical squads are involved in such a process, it has become more common to choose coding instead of a low-code solution.

    Benefits of Coding

    There are many benefits of using programming languages to develop your RPAs:

    • Use thousands of open source frameworks for automation-related tasks.
    • Reuse your solutions easily through modularization (e.g., by creating libraries).
    • Develop automations in open technology instead of a proprietary format.
    • Adopt software engineering best practices like Design Patterns, Refactoring, Automated Tests and CI/CD.
    • Customize your technology stack based on your needs.
    • Make the best use of computational resources through software optimization.

    Orchestrating Python RPAs

    However, the development of RPA automation is just one step in delivering a solution to production since you must be able to:

    • Deploy your automations into runtime environments.
    • Schedule your automations.
    • Manage tasks in queues.
    • Monitor executions.
    • Trigger alerts and notifications.
    • Handle errors immediately.

    BotCity Maestro Orchestrator

    BotCity developed a Cloud Orchestrator that addresses all those issues. Now you can deploy and orchestrate your Python RPAs using the Command Line Interface (CLI), the APIs and the web platform.

    Home Screen

    The Home Screen shows a dashboard with basic information, so you have the big picture of your entire operation.

    BotCity Maestro Home Screen

    Tasks

    The Task Queue shows tasks in execution, ready to be executed and finished. Each card represents a single task. The color in the bottom bar indicates the state of the task.

    BotCity Maestro Task Queue

    Using the New Task feature, you can create a new task for a specific activity directly from the portal:

    BotCity Maestro New Task Module.

    You can also create tasks through the API or the CLI. A single HTTP POST or CLI command adds a new task to the queue.

    Logs

    Logs are a compelling way to track the execution of your automation and collect metrics. You can set different columns for each log table and log information in real-time from a single Python command.

    BotCity Maestro Logs Module.

    Alerts

    When multiple automations run at the same time, it is a challenge to visualize the entire operation. Alerts provide short messages that describe a specific aspect of a given automation. A single line of Python code is all it takes.

    BotCity Maestro Alerts Module.

    Runtime Environments

    Machines are runtime environments used to execute automations. A machine can be a virtual machine, a container or even a physical machine: any computing resource available for execution. This module lets you visualize and manage such environments.

    BotCity Maestro Runtimes Module.

    Dashboards

    When it comes to monitoring and managing a complex operation with multiple automations, it is often necessary to have a dashboard that shows the status and critical KPIs. Using BotCity Connectors you can bring your RPA statistics into Google Data Studio and Power BI.

    BotCity Orchestrator Dashboard Module

    Free Version

    BotCity Maestro Community lets you explore the orchestration of Python RPAs for free. Just sign up and start using it today.

  • Free Python Automation Course

    BotCity Academy is Online 🎓

    We’re pleased to announce that BotCity Academy is online. 🎉

    It is a platform to learn how to develop and manage automations using open source and BotCity. It is totally free.

    The courses are going to provide video lessons, articles, examples, challenges and other references to teach concepts related to automation ops.

    Get Started with BotCity Python

    In our first course, we provide an introduction to BotCity’s technological stack, explain how to set up your development environment and create your first automation through the following lessons:

    • Welcome
    • BotCity
    • Download BotCity Studio SDK
    • Development Environment Setup
    • Hello Bot
    • Desktop Automations
    • Web Automations
    • PDF Processing

    If you don’t have a BotCity account yet, just sign up for free now.

    After logging in, simply click on the BotCity Academy card to start your learning journey.

  • BotCity Maestro and Documentation Portal

    👋 Hello!

    We are excited to announce two major developments that our team has been working on over the last two months.

    Our cloud orchestrator BotCity Maestro received major architectural and performance upgrades and a handful of new features.

    Our Developers Portal was merged with our various API reference websites, giving rise to our new unified Documentation Portal: a single place for everything BotCity!

    BotCity Maestro

    From signup to the orchestration of your tasks and dynamic dashboards, everything was rebuilt to bring you a more modern, performant and robust solution. All that with 100% backward compatibility with existing SDK and API code.

    The experience remains the same, but we have some important new features:

    Login using your e-mail and password

    Here the major change was the replacement of the Login field with the E-mail field. Now, instead of your login, simply type your e-mail and BotCity Maestro password to access your organization workspace.

    New BotCity Login panel

    Task Queue Performance and Live Update

    Fast access to information is paramount when orchestrating automations and other processes.

    We realized that our Task Queue page started to present some delays for intense operations where thousands of tasks were created every day. With that in mind, our team refactored the UI and the backend to make sure the information about your tasks is available almost instantaneously.

    Moreover, the previous version required users to constantly reload the page to get fresh information about the Task Queue: whether a queued task had started processing, whether it had finished and with which status. Now the BotCity Maestro task queue page updates reactively, refreshing your task cards as soon as the status of a task changes. 🚀

    Give it a try and let us know what you think.

    Task Queue

    New Tasks – Directly from the UI

    Up to now, most users created tasks with either the BotCity Command-line Interface (CLI) or via the BotCity Maestro SDK. The “New Task” menu was not generally available.

    We are glad to now introduce the exciting dynamic “New Task” feature.

    Using the New Task feature, you can create a new task for a specific activity directly from the portal.

    New Task

    Creating a new task is as simple as clicking on the New Task button and confirming the action on the subsequent screen.

    For activities where parameters are involved, the New Task screen will dynamically construct the proper form with the best components for each field and its associated data type.

    New Task
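
    One way to picture a dynamically constructed form is a lookup from each parameter’s data type to a form component. The sketch below is purely illustrative: the type names, widget names and the build_form helper are assumptions and do not describe how BotCity Maestro is implemented.

```python
# Hypothetical mapping from a parameter's data type to a form component.
WIDGET_BY_TYPE = {
    "text": "text input",
    "integer": "number input",
    "boolean": "checkbox",
    "date": "date picker",
}


def build_form(parameters):
    """Return one form field description per activity parameter."""
    form = []
    for param in parameters:
        # Unknown types fall back to a plain text input.
        widget = WIDGET_BY_TYPE.get(param["type"], "text input")
        form.append({"label": param["name"], "widget": widget})
    return form


# Assumed parameters for an imaginary activity.
fields = build_form([
    {"name": "customer_id", "type": "integer"},
    {"name": "send_email", "type": "boolean"},
])
print(fields)
```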

    Now operators can simply log in to BotCity Maestro and create tasks with or without parameters with ease.

    This is the first version of this feature and it will soon receive updates.

    Give it a try now and let us know if you have suggestions. We are excited to hear what you think.

    Alerts Live Update

    For an enhanced experience while monitoring your operation, the Alerts page now automatically loads new alerts as they are generated on the platform. This means that you no longer need to reload the page to receive new notifications.

    Other Improvements

    Many other improvements were introduced with this new version, mainly architectural changes for enhanced performance, availability and robustness.

    The Log and Result Files features received a new pagination feature and performance improvements.

    Documentation Portal

    You no longer need to bookmark a bunch of links or browse around to find information about BotCity’s command line interface, a BotCity Maestro feature or even the Java and Python open-source frameworks.

    We got you covered with our new unified documentation portal.

    Documentation Portal

    This new Documentation Portal is a living organism and it will receive constant updates for APIs, Plugins, Features, Tutorials and much more!

    No need to bookmark it as we made the URL super easy to remember… it is:

    https://documentation.botcity.dev

    Hello, Hola, Olá, こんにちは, Ciao, Hallo, 你好, i18N 🌐

    We are adding Internationalization support to our documentation portal!

    You can select your preferred language via the top menu icon near the search bar:

    As of now, the core language of our documentation is English, but our team is working hard to add support for more languages, starting with Portuguese.

    As a rule of thumb, if content is not yet available in your language of choice, the system will automatically fall back to English.
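
    The fallback rule can be expressed in a few lines. This is a minimal sketch with invented page IDs and a hypothetical get_page helper, not the documentation portal’s actual code.

```python
DEFAULT_LANGUAGE = "en"

# Assumed page store: page id -> {language code: content}.
PAGES = {
    "getting-started": {"en": "Getting Started", "pt": "Primeiros Passos"},
    "cli-reference": {"en": "CLI Reference"},
}


def get_page(page_id, language):
    """Return the page in the requested language, or English if it is missing."""
    translations = PAGES[page_id]
    return translations.get(language, translations[DEFAULT_LANGUAGE])


translated = get_page("getting-started", "pt")   # a Portuguese version exists
fallback = get_page("cli-reference", "pt")       # no Portuguese version yet
print(translated, "|", fallback)
```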

    More on the Way

    We hope you are enjoying the new features as much as we are while developing and using them.

    As always, if you have questions or find any sort of issues or would like to suggest a feature, feel free to reach out to us via one of our community channels.

    Have fun automating! 🦾🤖

  • BotCity Studio SDK 2.16.0

    The latest version of our development suite is now available for our community users.

    This release brings lots of improvements for BotCity Studio, Runner and CLI as well as many bug fixes.

    Make sure to download and update your setup by logging in to BotCity Maestro and clicking Download BotStudio.

    Let’s take a look at the shiny new features and the bugs that got fixed with this new release of our SDK.

    BotCity Studio

    Jumping from 2.13.0 to 2.16.0, BotCity Studio gained 8 new features and had a total of 9 major bugs squashed.

    Login with E-mail

    Following the new BotCity Maestro version, the new update of BotCity Studio requires users to log in with their e-mail and password.

    New BotCity Studio Login

    Click and Drag to Crop

    We changed the way users interacted with the UI panel to make the image cropping more natural.

    Before this change, users were required to click on the top-left corner of an area of interest, move the mouse to the bottom-right corner, click and move back to the top-left corner and finally click again to confirm the crop area. While this method allowed for a tight selection it felt unnatural to most users.

    Now you can simply click and drag to select your desired crop area.

    New click and drag image crop method

    Preferences Panel

    This new addition will make it easy for you to select the default language for the OCR component under the document processing module when dealing with images and photographic/scanned PDF files, the font size for the code editor and much more.

    New Preferences panel

    Dark Mode

    Is a developer tool really a developer tool without dark mode? Now you can easily switch between light and dark mode using our brand new preferences panel.

    BotCity Studio.exe for Windows

    Microsoft Windows users now can skip the BotStudio.bat and jump directly into double clicking the new BotCity Studio executable file shipped with the Windows SDK.

    Other Changes

    Other improvements include the addition of scrollbars to the document processing panel and support for both F9 and ⏩ on macOS for screen capture.

    BotCity Runner

    Our BotCity Runner received many important new features and a fix for a memory leak that affected clients with a large volume of tasks executed over a short period of time.

    Headless Execution

    Now you can run the BotCity Runner in headless mode (without graphical user interface). This means that it can now be used with headless servers, including containers such as Docker and even serverless frameworks such as AWS Lambda and Azure Functions. How cool is that?

    The SDK package offers two scripts to launch BotCity Runner:

    • BotRunner-gui: starts the BotCity Runner in graphical mode and requires you to click the start button to communicate with the BotCity Maestro orchestrator.
    • BotRunner: starts the BotCity Runner in the background and automatically establishes the connection with the BotCity Maestro orchestrator.
    ⚠️ Important ⚠️
    
    If you invoke the BotCity Runner from the command line without one of the wrapper scripts described above, it will start in headless mode by default. To revert to the graphical start-up, add the -gui flag when starting it.
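
    To picture the default described above, here is a small sketch of launcher-style argument parsing in which headless mode is the default and -gui switches to graphical mode. It is an illustration only: parse_runner_args and runner-sketch are hypothetical names, and this is not the BotCity Runner’s actual code.

```python
import argparse


def parse_runner_args(argv):
    """Hypothetical launcher parsing: headless unless -gui is passed."""
    parser = argparse.ArgumentParser(prog="runner-sketch")
    parser.add_argument(
        "-gui",
        action="store_true",
        help="start with the graphical interface",
    )
    args = parser.parse_args(argv)
    return {"headless": not args.gui}


default_mode = parse_runner_args([])        # no flag: headless
gui_mode = parse_runner_args(["-gui"])      # explicit flag: graphical
print(default_mode, gui_mode)
```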

    Custom Python Code

    Up to now, developers were asked to use our Template Project for Python projects. While it provides lots of awesome features, it imposes a format that can be tricky to integrate if you already have a Python codebase that you want to orchestrate, or if you just want a single Python file without the structure of a Python package.

    We heard you, and now the BotCity Runner supports the execution of custom Python projects.

    To develop a Python project in this new format you will need a minimum of two files:

    • bot.py: This is the file that will be invoked to start your bot. Here you can do anything you would like.
    • requirements.txt: This is where you describe your external Python dependencies, such as pandas and numpy, so that the BotCity Runner can install them before executing your code.

    After you have your files, you can simply compress your folder (make a zip file or a tarball) and use the BotCity CLI to deploy, update and release this new code into the BotCity Maestro orchestrator.
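
    As a rough illustration of the packaging step, the sketch below checks that the two required files exist and compresses the project folder into a tarball using only the Python standard library. The folder name my_bot, the archive name and the choice to store files at the archive root are assumptions made for this example; check the BotCity documentation for the exact layout the Runner expects.

```python
import tarfile
import tempfile
from pathlib import Path


def package_bot(project_dir, archive_path):
    """Check for the required files, then compress the folder into a .tar.gz."""
    project_dir = Path(project_dir)
    required = ["bot.py", "requirements.txt"]
    missing = [name for name in required if not (project_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing required files: {missing}")
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for path in sorted(project_dir.iterdir()):
            # Assumption: files are stored at the archive root.
            archive.add(str(path), arcname=path.name)
    return Path(archive_path)


# Build a throwaway project so the sketch runs end to end.
workdir = Path(tempfile.mkdtemp())
project = workdir / "my_bot"
project.mkdir()
(project / "bot.py").write_text('print("Hello from my bot")\n')
(project / "requirements.txt").write_text("pandas\nnumpy\n")

archive_file = package_bot(project, workdir / "my_bot-1.0.tar.gz")
with tarfile.open(str(archive_file)) as archive:
    archive_names = sorted(archive.getnames())
print(archive_names)
```

    The resulting archive is the kind of file you would then pass to the BotCity CLI deploy command.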

    Keep an eye on our YouTube channel for more details about this feature as well as a tutorial over the next week.

    BotCity CLI

    Our command line interface received a new command to allow you to cancel tasks and the help functionality was updated as well.

    Task Cancel

    Now you can cancel tasks using the BotCity CLI instead of finalizing them with an error status. Doing so is simple:

    # For Windows
    > BotCLI.bat task cancel -taskId 12345
    
    # For Linux and MacOS
    > BotCLI.sh task cancel -taskId 12345

  • Automating Android Apps

    When we think about process automation, we immediately think about Desktop and Web. This happens because the vast majority of processes to be automated are available through web pages or are in applications that run on more common and consolidated operating systems such as Windows.

    With digitalization and the increasing need for mobility and portability, we have the emergence of several applications for mobile devices, especially in Android environments, and the possibility of automating mobile environments becomes increasingly necessary.

    Automation in Android Environments

    Currently there are some alternatives used for automation in Android environments focused on automated tests, such as Selendroid and Appium.

    In the case of Selendroid, despite its compatibility with different versions of Android, it is only possible to automate one app at a time in addition to requiring special permissions in the application.

    Appium, the most used framework for mobile automations, requires the configuration of an HTTP server as well as the use of the UIAutomator driver to translate the automation commands. Moreover, Appium does not support older versions of Android, making it necessary to use other tools to automate legacy versions.

    Using the BotCity framework it is possible to build complete automations of applications and processes on an Android system, quickly and in a very simplified way, similarly to how it is done in Desktop applications, mimicking the experience of a human user.

    Accessing Android Environments

    To access an Android system, we can use tools that mirror the screen of a given device, such as TeamViewer, or tools that emulate a complete Android system, such as BlueStacks or the Android Emulator, the emulator used in the Android Studio IDE.

    TeamViewer

    TeamViewer uses an Android app and a Desktop app that connects to the device through an ID provided by the app. The device screen is mirrored and it can be accessed through the Desktop.

    Android Emulator

    The Android Emulator simulates devices on your computer, enabling you to access and test different devices and applications without having a physical device.

    BlueStacks

    Like the Android Emulator, BlueStacks lets you emulate an Android system without a real device. Simply installing the emulator gives you access to a complete environment, with no specific configuration required.

    Emulated device home screen

    In this way, using a tool that provides access to an Android environment and leveraging the BotCity Desktop Framework we can automate any Android application.

    In this article we will use BlueStacks to build a basic example and demonstrate how the Desktop Bot works to automate an Android application.

    A Practical Example

    Prerequisites

    For this example, you will need Python 3.7 or newer as well as the BlueStacks emulator, which you can download using this link.

    Setup

    After downloading, install the BlueStacks emulator. The installation does not require any specific configuration: just follow the steps in the installer and at the end the environment will be ready for use.

    Note that you can adjust the display settings as you see fit. For this example the default BlueStacks settings are being used.

    BlueStacks settings options

    In this example, we will automate the process of filling out a form using the Jotform application. For that we will get the application from the Play Store and create the example form using the Information Request template.

    Install Jotform app

    With everything installed and the environment already configured, let’s create a new Python project of the Desktop Bot type using the project template through the CookieCutter command. You can find information on how to create a new project at this link.

    Desktop Bot project template

    Source Code

    The code we will use in our example basically finds the form fields and fills them with the data.

    We won’t dive into the code details for this example, but you can see what it looks like in the snippet below.

    Source code example

    You can download the code for this example by clicking here and visiting the BotRepository.

    Complete Execution

    Conclusion

    In this article, we covered strategies to develop automations in Android systems and the existing alternatives to access and configure this type of environment. With a small example we were able to show that the BotCity framework is also capable of operating on Android systems with ease. In a few steps you can access the environment and automate basically any application or process in the same way as for Desktop applications. This becomes a great alternative if an application or a certain process can only be accessed via an Android device.

  • Ditch Regex or XML conversion to parse PDF documents

    Discover a new approach that mimics the way human vision handles reading documents

    When writing code to read PDF documents automatically, two approaches are very common: using regular expressions (regex), or converting the document to a structured format such as XML and parsing that. In both cases, you need to work out specific rules (regex patterns or XML parsing rules) for each field in the document.

    Let’s look at an example of parsing a few document fields using regex:
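
    A minimal sketch of what such a regex-based reader tends to look like. The sample text, field names and patterns are invented for illustration; the point is that every field needs its own hand-written rule.

```python
import re

# Invented text, standing in for what a PDF-to-text step might return.
TEXT = """Invoice Number: INV-2041
Issue Date: 2021-03-15
Total Amount: 1,250.00
"""

# One hand-written rule per field.
PATTERNS = {
    "invoice_number": r"Invoice Number:\s*(\S+)",
    "issue_date": r"Issue Date:\s*(\d{4}-\d{2}-\d{2})",
    "total_amount": r"Total Amount:\s*([\d,]+\.\d{2})",
}


def parse_fields(text):
    """Apply each pattern and collect the captured values (None if no match)."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1) if match else None
    return fields


regex_fields = parse_fields(TEXT)
print(regex_fields)
```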

    Now, parsing an XML document:
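
    A comparable sketch for the XML route, again with invented data. It assumes a converter that emits one text element per fragment with top and left coordinates, which is a common shape for such output but not a format taken from this article.

```python
import xml.etree.ElementTree as ET

# Invented XML, standing in for the output of a PDF-to-XML converter.
XML_TEXT = """
<page>
  <text top="100" left="50">Invoice Number:</text>
  <text top="100" left="200">INV-2041</text>
  <text top="130" left="50">Total Amount:</text>
  <text top="130" left="200">1,250.00</text>
</page>
"""


def value_at(root, top, left):
    """Return the text of the element at an exact position, or None."""
    for node in root.iter("text"):
        if node.get("top") == str(top) and node.get("left") == str(left):
            return node.text
    return None


root = ET.fromstring(XML_TEXT)
xml_fields = {
    "invoice_number": value_at(root, 100, 200),
    "total_amount": value_at(root, 130, 200),
}
# A one-unit shift in position is enough to lose the field.
shifted = value_at(root, 131, 200)
print(xml_fields, shifted)
```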

    As you can see, developing the reader can be very laborious depending on the number of fields in the document.

    In addition, both approaches are very sensitive to changes in the document, such as omitting a field or changing its position. Even if the change looks minimal when viewing the document, it can break the parser, because the parser is not based on the document’s visual structure.

    Now, let’s look at this problem from a different perspective. Why can humans still read a document even when fields are moved or changed? The answer is quite simple: humans do not read documents by taking into account the position of the fields. Instead, we usually look for a relationship between label and value:

    In red we have the labels, which are basically the definition of the field in question, and in blue we have the values. Fields (labels and values) are normally grouped by some context to make reading easier, but if we change the position of the fields in the document, humans can still understand it without any problem.
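
    The label-and-value idea can be sketched in plain Python. The example assumes the document is available as a list of text fragments with page coordinates, which is an invented representation; read_value is a hypothetical helper and not part of any BotCity library.

```python
# Invented word boxes, as an OCR or PDF text layer might expose them: (text, x, y).
WORDS = [
    ("Invoice Number:", 50, 100),
    ("INV-2041", 200, 100),
    ("Total Amount:", 50, 130),
    ("1,250.00", 200, 130),
]


def read_value(words, label, y_tolerance=5):
    """Return the nearest text to the right of the label on the same line."""
    anchor = next(((x, y) for text, x, y in words if text == label), None)
    if anchor is None:
        return None
    label_x, label_y = anchor
    candidates = [
        (x - label_x, text)
        for text, x, y in words
        if x > label_x and abs(y - label_y) <= y_tolerance
    ]
    # The smallest horizontal distance wins.
    return min(candidates)[1] if candidates else None


# Moving every field elsewhere on the page does not break the lookup,
# because the value is found relative to its label.
moved = [(text, x + 300, y + 400) for text, x, y in WORDS]
original_total = read_value(WORDS, "Total Amount:")
moved_total = read_value(moved, "Total Amount:")
print(original_total, moved_total)
```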

    What if it were possible to use the same concept when creating parsers to read documents automatically? What if there were a tool that generated the parser code automatically as you click on the fields and values in the document?

    Let’s talk about BotCity Documents

    BotCity Documents is a framework that lets you easily create parsers and read documents, using the Python or Java programming languages, in the same way you would naturally read a document: by establishing a relationship between labels and fields.

    Using the intuitive BotCity Studio interface and automatic code generation together with the BotCity Documents framework for document parsing, the code to parse a given field in the document is generated quite simply:

    Step 1 – Select the field in the document

    Step 2 – Select the reading area for the chosen field

    Step 3 – The code is generated automatically

    This process is repeated for each field you need to read, and your custom reader is created in minutes.

    By leveraging BotCity plugins to integrate your code with your favorite OCR provider, such as Google Cloud Vision, Azure Cognitive Services or even the open-source Tesseract project, readers created with BotCity Documents can be extended to transparently handle not only text-based PDFs but also scanned PDFs and image files using the same code.

    All of this means fewer headaches creating multiple readers and parsers and integrating with third-party services.

    Take a look at BotCity Documents in action and see how you can boost your team’s productivity by building parsers not only faster but also in a more maintainable and reliable way.

  • From PDF to JSON in minutes. Meet BotCity Docs.

    Discover a new approach that mimics the way human vision handles reading documents

    Enterprise applications and services constantly need to read, parse and extract information from a huge variety of documents, such as invoices, pay stubs, tax documents and others.

    When dealing with structured formats such as CSV or spreadsheets the task is trivial, but with PDF documents, whether scanned or based on text and images, it becomes difficult.

    To solve this problem, developers usually use regular expressions (regex) or convert the document to a structured format such as XML in order to parse it. This process is not only very laborious, depending on the number of fields in the document, but also highly sensitive to changes in the document or to missing fields.

    Cloud-based pay-per-page services that combine pre-built templates with AI-based document parser generators are in fashion, but most of the time these services are niche: when exposed to documents outside their predefined set of templates, they can only handle some structured tabular data from text-based PDFs. In addition, considerable effort and a large dataset are needed to train the AI model so that it parses the files with a confidence level high enough to process a batch of documents successfully.

    We humans are more resilient to changes in documents when it comes to changes in positioning, because our eyes and brain are always looking for a relationship between labels and values.

    Let’s talk about BotCity Documents

    BotCity Documents is a framework that lets you easily create parsers and read documents, using the Python or Java programming languages, in the same way you would naturally read a document: by establishing a relationship between labels and fields.

    Using the intuitive BotCity Studio interface and automatic code generation together with the BotCity Documents framework for document parsing, the code to parse a given field in the document is generated quite simply:

    Step 1 – Select the field in the document

    Step 2 – Select the reading area for the chosen field

    Step 3 – The code is generated automatically

    This process is repeated for each field you need to read, and your custom reader is created in minutes.

    Ao aproveitar os plugins da BotCity para integrar seu código ao seu provedor de OCR favorito, como Google Cloud Vision, Azure Cognitive Services ou até mesmo o projeto de código aberto Tesseract, leitores criados com BotCity Documents podem ser estendidos para lidar de forma transparente não apenas com PDFs baseados em texto, mas também digitalizados e arquivos de imagem usando o código.

    Tudo isso significa menos dor de cabeça criando vários leitores, analisadores e integração com serviços de terceiros.

    Dê uma olhada no BotCity Documents em ação e veja como você pode aumentar a produtividade de sua equipe construindo analisadores não apenas mais rápidos, mas de maneira sustentável e confiável.

  • How to create Desktop Automations just like Selenium

    How to create Desktop Automations just like Selenium

    Web pages and applications use HTML and JavaScript to provide an interface for the user of a page or system. Since those technologies are interpreted by the browser, the web application’s code, or at least its user-interface part, is open to anyone accessing the page.

    Because that code is written by people (programmers) and interpreted in its original form (there is no format conversion, as there is when a compiler is used), anyone with knowledge of web application development can understand it.

    All of these characteristics lead to web automations being built through direct interaction with page elements. Selenium is a good example, especially thanks to Selenium IDE for Chrome. In this type of automation, the developer reads and changes the values of interface components, mostly by using the explicit identifiers those components have in the source code. To make that possible, Selenium provides a WebDriver that controls the browser and lets you read and modify the page while navigating the web.

    Example code to retrieve the value of a textbox with id “textfield_id”.
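
    As a minimal sketch of the idea, the example below looks up an element by its explicit identifier in the page source. With Selenium’s Python bindings the lookup would typically go through find_element(By.ID, "textfield_id") followed by get_attribute("value"), which needs a running browser. To stay self-contained, this sketch swaps in the standard library’s HTML parser and reads the value attribute from a hypothetical page fragment, so it sees only the static source, not what a user has typed.

```python
from html.parser import HTMLParser


class InputValueFinder(HTMLParser):
    """Collects the value attribute of the <input> element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("id") == self.target_id:
            self.value = attributes.get("value")


def textbox_value(html, element_id):
    """Return the value of the textbox with the given id, or None if absent."""
    finder = InputValueFinder(element_id)
    finder.feed(html)
    return finder.value


# Hypothetical page fragment, used only for illustration.
PAGE = '<form><input type="text" id="textfield_id" value="Hello"></form>'

print(textbox_value(PAGE, "textfield_id"))  # Hello
```

    The lookup depends only on the id, which is why this style of automation is straightforward when the source code is open.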

    But what about Desktop?

    When we move to the desktop applications environment, we see a different scenario. Unlike the web environment, with its open technologies, protocols, and patterns followed by many companies, the desktop relies on closed technologies supplied by different companies. Starting with the operating system, we can find processes to automate on Windows, Linux, or macOS.

    In the case of the most popular desktop system, Windows, the applications are binary files that are much harder to interpret than a web page. Besides, the final graphical interface presented to the user may be provided by the native Windows GUI, by multiplatform GUI toolkits, or by virtual-machine-based environments such as Java.

    One approach to deal with all of those scenarios in a desktop environment is to recognize the graphic interface components with computer vision, and interact with them through the same interface used by the final user: mouse and keyboard events.

    In this article we use BotCity Studio and BotCity Framework; you can create an account to follow along.

    A Practical Example

    Let’s look at a practical example of how the robot interacts with interface components in a Desktop environment. Below is a screenshot of Fakturama, a business application for creating invoices, delivery notes, orders, reminders, and more.

    Screenshot of Fakturama, a business application for creating invoices, delivery notes, orders, reminders, and more.

    If we want to click on the “New product” option, we can crop this interface element, producing an image that a Desktop RPA framework can use to identify it:

    Cut from the interface for the “New product” menu item

    In this case, we are using BotCity Framework. The code used to find and click at this element on the screen is the following:

    Internally, BotCity Framework constantly sweeps the screen in search of the component containing that visual representation and, when it has been found, a click event is generated at the component’s position. The matching parameter is the confidence level and waiting_time is the time limit in milliseconds of the search. Therefore, the developer does not work with fixed coordinates within the source code. Rather, they are determined by computer vision algorithms at runtime. This way, even if the component appears in a new position, perhaps because a new item was inserted into a menu, the automation keeps working.
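
    To make the roles of the matching and waiting_time parameters concrete, here is a minimal sketch of the search loop described above. It is not BotCity Framework’s implementation: it assumes a naive matcher that counts identical pixels on small grids of numbers, and the function names and the grab_screen callback are hypothetical.

```python
import time


def match_score(screen, template, top, left):
    """Fraction of template pixels equal to the screen pixels at (top, left)."""
    rows, cols = len(template), len(template[0])
    equal = sum(
        screen[top + r][left + c] == template[r][c]
        for r in range(rows)
        for c in range(cols)
    )
    return equal / (rows * cols)


def find_on_screen(screen, template, matching=0.97):
    """Return (x, y) of the best match scoring at least `matching`, else None."""
    rows, cols = len(template), len(template[0])
    best_score, best_pos = 0.0, None
    for top in range(len(screen) - rows + 1):
        for left in range(len(screen[0]) - cols + 1):
            score = match_score(screen, template, top, left)
            if score >= matching and score > best_score:
                best_score, best_pos = score, (left, top)
    return best_pos


def wait_and_find(grab_screen, template, matching=0.97, waiting_time=10000):
    """Keep scanning fresh screenshots until a match or the time limit (ms)."""
    deadline = time.monotonic() + waiting_time / 1000
    while True:
        position = find_on_screen(grab_screen(), template, matching)
        if position is not None or time.monotonic() >= deadline:
            return position
        time.sleep(0.05)


# Hypothetical "screens": the same 2x2 component at two different positions.
TEMPLATE = [[1, 2], [3, 4]]
SCREEN_A = [[0, 0, 0, 0, 0], [0, 1, 2, 0, 0], [0, 3, 4, 0, 0], [0, 0, 0, 0, 0]]
SCREEN_B = [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 1, 2], [0, 0, 0, 3, 4]]

print(find_on_screen(SCREEN_A, TEMPLATE))  # (1, 1)
print(find_on_screen(SCREEN_B, TEMPLATE))  # (3, 2)
```

    The coordinates come out of the search rather than the source code, so the same call works for both screens.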

    You may request a new community license of BotCity Studio and try building with it.

    The same principle of seek and click can be used for any other interface component, even for information input components. Now, let’s suppose we want to insert a new value in the text box below:

    To do that, we’ll cut out a visual representation of the label related to the text box:

    In this case, however, we won’t be clicking on the label, but beside it, using the code below:

    The click_relative method performs a click at a position x pixels to the right (or left, if negative) and y pixels below (or above, if negative) the position of an anchor object, in this case the label. After that, to insert data into the text box, you can call paste("name value"), which pastes the given string through the clipboard.
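
    The offset arithmetic can be sketched in a few lines. This is only an illustration of the geometry, not the framework’s code: the helper name is hypothetical, and it assumes the anchor position is the top-left corner of the matched label.

```python
def relative_click_point(anchor, x_offset, y_offset):
    """Return the screen point x_offset/y_offset pixels away from the anchor."""
    anchor_x, anchor_y = anchor
    return anchor_x + x_offset, anchor_y + y_offset


# Hypothetical positions where the label was found on two different runs.
label_first_run = (40, 120)
label_after_layout_change = (40, 180)

# Same offsets in both runs: 150 px to the right of and 5 px below the label.
print(relative_click_point(label_first_run, 150, 5))            # (190, 125)
print(relative_click_point(label_after_layout_change, 150, 5))  # (190, 185)
```

    Because the offsets are relative to the label, the click follows the text box when the form moves on the screen.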

    You may be wondering how laborious it would be to cut out all those visual elements: using image editors, remembering to save the files, and so on. To ease that part of the process, there is BotCity Studio, a complementary tool for your development IDE that lets you collect screenshots, cut out visual representations of the components, and generate the source code automatically.

    The animation below shows a developer cutting out an interface component in BotCity Studio; the component’s image and the source code to find it are generated automatically.

    BotCity Studio, select the desired element and code is automatically generated

    Full Process Example

    In the video below, I show you how to create a desktop robot to automatically register new products using Fakturama.

    In less than 15 minutes the automation flow is produced. It is worth checking it out!

    Conclusion

    In this article, we discussed the differences between automations in Web and Desktop environments. Web automations pose fewer challenges because the technologies are open, which makes page elements easy to reach. In the Desktop environment, we must resort to more sophisticated solutions to interact with the different technologies used in it.

    We have shown how this technology works and how to create Desktop automations using the BotCity Framework and the BotCity Studio.

    Want to check it out for real? Create an account now!

  • No more Regex or XML conversion to parse PDF Documents

    No more Regex or XML conversion to parse PDF Documents

    Discover a new approach that mimics the way human vision addresses document reading

    When creating parsers to read PDF documents automatically, two approaches are very common: writing regex rules, or converting the document to a structured format like XML and parsing that. In both cases, you need to figure out specific rules (regex or XML parsing) for each field in the document.

    Let’s see an example of parsing some document fields using regex:
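
    A minimal sketch of this approach, assuming a hypothetical invoice whose text has already been extracted from the PDF (the field names and patterns are illustrative only):

```python
import re

# Hypothetical text extracted from an invoice PDF.
INVOICE_TEXT = """\
Invoice Number: INV-0042
Issue Date: 2021-03-15
Total Due: 1,250.00
"""

# One hand-written rule per field.
FIELD_PATTERNS = {
    "invoice_number": r"Invoice Number:\s*(\S+)",
    "issue_date": r"Issue Date:\s*(\d{4}-\d{2}-\d{2})",
    "total_due": r"Total Due:\s*([\d,]+\.\d{2})",
}


def parse_with_regex(text):
    """Apply every field pattern; a field that does not match becomes None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1) if match else None
    return fields


print(parse_with_regex(INVOICE_TEXT))
# {'invoice_number': 'INV-0042', 'issue_date': '2021-03-15', 'total_due': '1,250.00'}
```

    Each field needs its own rule, and a small change such as the label becoming “Invoice No.:” makes the corresponding pattern return None.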

    Now, parsing an XML document:
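
    A comparable sketch for the XML route, assuming a hypothetical converter output in which every piece of text becomes a text element with its position on the page (the element and attribute names are assumptions, not the output of a specific tool):

```python
import xml.etree.ElementTree as ET

# Hypothetical XML produced by converting the PDF page.
INVOICE_XML = """\
<page>
  <text top="40" left="30">Invoice Number:</text>
  <text top="40" left="160">INV-0042</text>
  <text top="60" left="30">Total Due:</text>
  <text top="60" left="160">1,250.00</text>
</page>
"""


def parse_with_xml(xml_text):
    """Pair each label with the element that immediately follows it."""
    root = ET.fromstring(xml_text)
    texts = [node.text for node in root.iter("text")]
    fields = {}
    # Rule: a label ends with ":" and its value is the next element in order.
    for label, value in zip(texts, texts[1:]):
        if label.endswith(":"):
            fields[label.rstrip(":")] = value
    return fields


print(parse_with_xml(INVOICE_XML))
# {'Invoice Number': 'INV-0042', 'Total Due': '1,250.00'}
```

    The rule relies on the order of elements in the converted file, so it silently pairs the wrong items if the converter emits them in a different order.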

    As can be seen, the parser development can be very laborious depending on the number of fields in the document.

    Moreover, both approaches are very sensitive to changes in the document, like omitting a field or changing its position. Even if the change seems minimal when viewing the document, it might break the parser, since the parser is not based on the document’s visual structure.

    Now, let’s take a look at this problem from a different perspective. Why can humans still read a document even if the position of the fields changes? The answer is pretty simple: humans don’t read documents by the position of the fields on the page. Usually, we look for a relationship between label and value:

    In red we have the labels that are basically the definition of the field in question and in blue we have the value. Usually, fields (labels and values) are grouped by some context to make the reading process easier, but if we change the position of the fields in the document, humans can still understand the document without any trouble.
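
    The label-to-value relationship can be sketched in code. This is an illustration of the concept only, not how BotCity Documents is implemented: it assumes the words of a page are available with their coordinates, and the Word class, the value_right_of helper, and the line_tolerance parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge on the page
    y: float  # vertical position of the line


def value_right_of(words, label, line_tolerance=3):
    """Return the text closest to the right of `label` on the same line."""
    anchor = next((w for w in words if w.text == label), None)
    if anchor is None:
        return None
    candidates = [
        w for w in words
        if w is not anchor
        and abs(w.y - anchor.y) <= line_tolerance
        and w.x > anchor.x
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda w: w.x - anchor.x).text


# The same fields in two hypothetical layouts.
LAYOUT_A = [
    Word("Invoice Number:", 30, 40), Word("INV-0042", 160, 40),
    Word("Total Due:", 30, 60), Word("1,250.00", 160, 60),
]
LAYOUT_B = [
    Word("Total Due:", 30, 40), Word("1,250.00", 160, 40),
    Word("Invoice Number:", 300, 500), Word("INV-0042", 430, 500),
]

print(value_right_of(LAYOUT_A, "Invoice Number:"))  # INV-0042
print(value_right_of(LAYOUT_B, "Invoice Number:"))  # INV-0042
```

    The lookup is anchored on the label rather than on a fixed page position, so moving the field does not change the result.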

    What if it were possible to use the same concept when creating parsers to read documents automatically? What if there were a tool that generated the parser code automatically as you click on the document’s fields and values?

    Let’s talk about BotCity Documents

    BotCity Documents is a framework that lets you easily create parsers and read documents using the Python or Java programming languages, in the same way you would naturally read a document: by establishing a relationship between labels and fields.

    Using BotCity Studio’s intuitive interface and automatic code generation alongside the BotCity Documents framework for document parsing, the code to parse a given field in the document is generated quite simply:

    Step 1 – Select the field in the document

    Step 2 – Select the reading area for the chosen field

    Step 3 – Code is generated automatically

    This process is repeated for each field in the document you need to read and your custom parser is built in minutes.

    By leveraging the BotCity plugins to seamlessly integrate with your favorite OCR provider, such as Google Cloud Vision, Azure Cognitive Services or even the open-source project Tesseract, BotCity Documents can be extended to transparently deal with not only text-based PDFs but also scanned PDFs and image files using the same codebase.

    All this means less headache creating multiple readers, parsers and integration with third-party services.

    Take a look at BotCity Documents in action and see how you can boost your team’s productivity by building parsers not only faster but in a more maintainable and reliable way.