Probyto AI Journal <h2><strong>Probyto AI Journal hosts conference papers on recent advances in AI</strong></h2> en-US (Probyto) (Probyto) Sun, 08 Mar 2020 00:00:00 +0000 OJS 60 The LAWBO: A Smart Lawyer Chatbot <p>Artificial intelligence (AI) has evolved to the stage where it can parse intentions and churn out useful responses to practical queries. Chatbots are AI-driven pieces of software that converse in human terms. They’re not quite ready to pass the Turing test, but ready enough for many forms of commerce and messaging. With the advent and rise of chatbot adoption, the question is not only how to build chatbots but also where to deploy them next. In the recent past, chatbots have found applications ranging from travel and personal finance to productivity and retail.<br>When it comes to conversing and understanding like humans, one of the most intricate domains for chatbots is the judicial system. One needs to pore over volumes of legal books and judgement papers to analyze and investigate a case. “Justice delayed is justice denied!” With time being the most valuable resource in this domain, a chatbot is a good investment for helping legal professionals save time and effort in probing a case. LAWBO can guide lawyers by suggesting parallels between cases and, at the same time, answer queries by fetching and deriving relevant knowledge from the enormous volume of legal data.<br>We use a combination of heuristics applied to data extracted from Supreme Court judgments using in-house developed, state-of-the-art parsers, and dynamic memory networks (DMN) for Natural Language Processing (NLP). A DMN is a neural network architecture that processes input sequences and questions, forms episodic memories, and generates relevant answers, which is essentially how chatbots function.
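The episodic-memory idea behind the DMN can be illustrated with a deliberately simplified sketch: repeated attention passes over input sentences, each pass folding the most relevant sentence into a growing "memory" of the question. The word-overlap scoring below is a toy stand-in for the trained network, and the example judgment sentences are invented:

```python
# Toy stand-in for a Dynamic Memory Network's episodic attention:
# repeatedly attend over input sentences, letting each pass fold the
# most relevant sentence into a "memory" of the question. The real
# DMN learns this with neural attention; here we use word overlap.

def tokenize(text):
    cleaned = "".join(c if c.isalnum() or c.isspace() else " "
                      for c in text.lower())
    return set(cleaned.split())

def episodic_retrieve(sentences, question, passes=2):
    """Return the sentence most relevant to the question after
    `passes` rounds of overlap-based attention."""
    memory = tokenize(question)
    best = None
    for _ in range(passes):
        scores = [len(memory & tokenize(s)) for s in sentences]
        best = sentences[scores.index(max(scores))]
        memory |= tokenize(best)   # fold the evidence into memory
    return best

facts = [
    "The appellant filed the suit in 1998.",
    "The high court dismissed the appeal.",
    "Compensation of two lakh rupees was awarded.",
]
print(episodic_retrieve(facts, "What compensation was awarded?"))
```

In the full system the input sentences would come from the judgment-paper parsers and the answer from the trained DMN rather than an overlap score.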
The training for question-answering tasks relies exclusively on trained word vector representations and input-question-answer triplets generated by our parsers from the judgment papers.</p> Unnamalai N, Kamalika G, Shubhashri G, Karthik Ramasubramanian, Abhishek Kumar Singh Copyright (c) 2020 Probyto Journal of AI Research Sun, 08 Mar 2020 13:18:38 +0000 The GeoAlert – Satellite Imagery based Flood Risk Assessment (FRA) system using Machine Learning <p>Assessment of flood risk zonation and landscape vulnerability to flood are crucial perspectives in flood risk management. Flooding is one of the most serious natural disasters in North-East India. The mighty Brahmaputra, one of the major rivers of Asia, is a trans-boundary waterway that flows through China, India and Bangladesh. Floods are a very regular occurrence during the monsoon season. Deforestation in the Brahmaputra watershed has brought increased siltation levels, flash floods and soil erosion in critical downstream natural habitats. To detect changes in land cover and NDVI rapidly and precisely, we have utilized remote sensing technology and Geographic Information Systems (GIS). In this paper, Landsat 8 Operational Land Imager (OLI) data were utilized to assess landscape vulnerability to flood inundation and flood risk in Kamrup district of Assam, India. A flood inundation outline was prepared based on water and land pixels in the images. We propose an automated flood-alert generation and continuous monitoring system that analyses recent images by extracting the number of geo-tagged water-body pixels (the area and contour of water and land regions); when the number of water pixels crosses a threshold, the system generates an alert.
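The pixel-thresholding alert described above can be sketched minimally: classify each pixel of an index grid as water or land, then raise an alert when the water-pixel count exceeds a known baseline. The index values, threshold and tolerance below are illustrative assumptions, not calibrated parameters:

```python
# Sketch of the alerting step: threshold a (toy) water-index grid
# into water/land pixels, then alert when water cover exceeds the
# baseline reported land/water data. Values here are illustrative.

WATER_INDEX_THRESHOLD = 0.3   # e.g. an NDWI-style cutoff (assumed)

def count_water_pixels(grid, threshold=WATER_INDEX_THRESHOLD):
    return sum(1 for row in grid for v in row if v > threshold)

def flood_alert(grid, baseline_water_pixels, tolerance=1.2):
    """Alert when water cover exceeds the known baseline by 20%."""
    return count_water_pixels(grid) > baseline_water_pixels * tolerance

scene = [
    [0.1, 0.5, 0.6],
    [0.2, 0.7, 0.8],
    [0.1, 0.4, 0.9],
]
print(count_water_pixels(scene))                    # 6 water pixels
print(flood_alert(scene, baseline_water_pixels=4))  # True (6 > 4.8)
```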
Based on the area of the water regions on the map, we can estimate flood-affected regions by comparing them with existing data on land and water cover reported by the government, and thereby alert officials to take preventive measures. Different metrics were evaluated to extract a more accurate area of waterlogged regions. With historical information on rainfall and the area of waterlogged regions, we can apply machine learning models to predict possible inundation in the next 48 hours, making it easier to move people to safe, higher-altitude regions during a flood.</p> Parvej Saleh, Rajdeep Purkayastha, Srivathshan KS Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 07:04:45 +0000 The Adaptive Assessment - AI technique to help people evaluate their skills <p>Adaptive assessments are designed to challenge students. High-achieving students can be challenged by more difficult questions, while students who are slightly below the average are<br>encouraged to continue moving forward by answering questions at or slightly above their current achievement level. This paper discusses the algorithmic approach and the challenges in achieving it. Beyond the algorithm, the solution is not complete without the overall system architecture to manage the data, secure the information, and make it available to the end user.</p> Nirmalyan M, Srivathshan KS Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 00:00:00 +0000 The Adverse Drug Reaction (ADR) detection in real-time using Deep Learning Models for Pharmacovigilance Studies <p>Pharmacovigilance is defined as the science and activities relating to the detection, assessment, understanding and prevention of adverse effects or any other drug-related problem. Identification of Adverse Drug Reactions (ADRs) during the post-marketing stage is one of the most essential goals of drug safety surveillance.
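The adaptive question-selection logic described in the Adaptive Assessment abstract can be sketched as a simple staircase, raising difficulty after a correct answer and lowering it after an incorrect one (the algorithm in the paper may be more elaborate):

```python
# A minimal staircase sketch of adaptive assessment: move the next
# question's difficulty up after a correct answer and down after an
# incorrect one, clamped to the available difficulty range.

def next_difficulty(current, correct, step=1, lo=1, hi=10):
    """Difficulty levels run from `lo` (easiest) to `hi` (hardest)."""
    if correct:
        return min(hi, current + step)
    return max(lo, current - step)

level = 5
for answer_correct in [True, True, False, True]:
    level = next_difficulty(level, answer_correct)
print(level)  # 5 -> 6 -> 7 -> 6 -> 7
```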
The Food and Drug Administration (FDA) uses the Adverse Event Reporting System (AERS) to monitor new safety concerns related to a marketed product, ensure compliance with reporting regulations and respond to outside requests for information. The FDA receives adverse event and medication error reports directly from multiple sources: healthcare professionals, including physicians and pharmacists, and consumers, including patients and lawyers. The data collected directly from physicians and patients suffer from a range of limitations, including under-reporting (only approximately 10% of serious ADRs are reported in AERS), over-reporting of known ADRs and incomplete data. Thus, research focus has recently broadened to the utilization of other sources of data for ADR detection. The Internet has opened opportunities for the public to share their opinions and feelings over social media, making it an indispensable source for pharmacovigilance studies. This paper presents a combined approach for pharmacovigilance using Big Data technology, Natural Language Processing (NLP) and public domain knowledge. The Big Data architecture is designed to ingest streaming data from Twitter and web-scraped data from the Daily Strength website, capturing real-time reports of ADRs and storing them in the Hadoop File System. A retrospective analysis of public social media data is conducted for numerous post-market drugs using both lexicon-based models and deep learning models of the LSTM family. Automated classifiers are used to identify English-language posts that resemble an adverse event, and the results are compared with existing work.
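The lexicon-based pass mentioned above can be sketched as simple term matching against an ADR vocabulary; the tiny lexicon here is illustrative, standing in for a full resource such as a MedDRA-derived term list:

```python
# Sketch of the lexicon-based pass: flag a post as a possible
# adverse-event mention when it contains a term from an ADR lexicon.
# The lexicon below is a tiny illustrative stand-in.

ADR_LEXICON = {"nausea", "headache", "dizziness", "rash", "insomnia"}

def find_adr_mentions(post):
    words = {w.strip(".,!?").lower() for w in post.split()}
    return sorted(words & ADR_LEXICON)

posts = [
    "Started drug X last week, terrible headache and nausea.",
    "Drug X works great for me!",
]
for p in posts:
    print(find_adr_mentions(p))
# ['headache', 'nausea']
# []
```

In the full pipeline, posts flagged here would then be passed to the LSTM-family classifiers for confirmation.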
This work shows promising results for mass-scale deployment for real-time identification of ADRs, and hence for improving safe drug use across the world.</p> Srivathshan KS, Chibi Chakarvathy, Dandu Aravind Pai, Gayathri R, Pranesh MP, Parvej Reja Saleh Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 00:00:00 +0000 The AirSense : Smart air quality monitoring and reporting tool using IoT devices and cloud service <p>In urban areas, exposure to indoor air pollution is increasing for numerous reasons, including the construction of more tightly sealed buildings, reduced ventilation, the use of synthetic materials for building and furnishing, and the use of chemical products, pesticides, and household care products. Indoor air pollution can originate inside the building or be drawn in from outside. Besides nitrogen dioxide, carbon monoxide, and lead, there are various other pollutants that affect the air quality in an enclosed space. The groups most susceptible to indoor pollution are women and children, because women's lungs are significantly smaller than those of their male counterparts, and the lung-volume-to-body-volume ratio of children is significantly higher than that of adults. Indoor air pollution monitoring requires as much attention as outdoor pollution monitoring. With advances in sensor technology and studies showing the harmful effects of indoor air pollution, it is important for us to start monitoring the air quality inside our schools, offices, hospitals, homes and other places. Air quality is influenced by multi-dimensional elements including location, time, and uncertain factors. Recently, many researchers have adopted big data analytics because of advances in big data applications. Sensors built on an Arduino board with Wi-Fi networking units are tested to monitor three air quality parameters: suspended particles, organic vapours and humidity.
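Threshold checks on the three monitored parameters can be sketched as follows; the limits used below are placeholders for illustration, not regulatory standards:

```python
# Sketch of per-parameter threshold checks: compare each monitored
# reading against an "unhealthy" limit and collect violations.
# The limit values below are placeholders, not standards.

UNHEALTHY_LIMITS = {
    "particulates_ugm3": 150.0,   # suspended particles
    "organic_vapour_ppm": 50.0,   # organic vapours
    "humidity_pct": 70.0,         # relative humidity
}

def check_air_quality(reading, limits=UNHEALTHY_LIMITS):
    """Return the list of parameters that exceed their limit."""
    return [k for k, limit in limits.items() if reading.get(k, 0) > limit]

reading = {"particulates_ugm3": 180.0, "organic_vapour_ppm": 12.0,
           "humidity_pct": 65.0}
violations = check_air_quality(reading)
if violations:
    print("ALERT:", violations)   # ALERT: ['particulates_ugm3']
```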
These key parameters are monitored over a period of time, the time-series data is stored in a cloud service, and machine learning is applied to find ways to predict and manage air quality. The paper presents the IoT device architecture, the cloud application architecture and sample results for an indoor test environment. Mobile and web-based visualizations were created for the data collected from the sensors. An alarm system was also developed to notify the user when the air quality deteriorates to an unhealthy level.</p> Parvej Reja Saleh, Akashjyoti Banik, Debalina Banerjee, Roshan Kumar Gupta, Srivathshan KS Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 00:00:00 +0000 The Fake News Detection using Dependency Tree based Recurrent Neural Network <p>The objective of this research is to solve the fake news detection problem through a linguistic and a neural network approach, based only on the news content. The problem is defined as the task of identifying news containing intentional deceptions among news that merely provides accurate information. The paper provides a sequential hybrid method, combining a linguistic understanding with a recurrent neural network model, that aims at identifying fake news.</p> Kavya Parthiban, Shruthi S, Srivathshan KS Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 00:00:00 +0000 The FormAssist : Deep learning methods for converting handwritten forms into digital assets <p>Customer agreements are required to follow statutory and legal requirements, which include agreements being manually signed. In India, paper forms are still prevalent in the banking industry. These paper forms require customers to fill in a template form in capital letters and sign manually, agreeing to the terms. This creates a challenge for analytical systems, as the data is captured outside the system and takes time to become part of the data pipeline.
The future of banks is poised to be digital; however, we still need historical data to train models for current data applications. This limitation is a known bottleneck in designing data applications for real-time decision making. Developing Optical Character Recognition (OCR) with capabilities comparable to those of a human is still not achievable, despite decades of painstaking research. Owing to the idiosyncrasies of individual forms, researchers from industry and academia have directed their attention towards OCR. The work in this paper presents an efficient model to capture offline handwritten forms and convert them into digital records. The model techniques are based on deep learning methodologies and show high accuracy on our test set of real application forms from selected banks. We experimented with different feature extraction techniques to extract handwritten characters from the forms. Our experimentation evolved over time towards a generalized solution and better results. The final model uses the relative positions of the characters for extracting characters from the forms and Convolutional Neural Networks (CNNs) to predict the characters. The paper also discusses the serverless architecture to host FormAssist as a REST API, with a model calibration feature to accommodate multiple types of forms.</p> Srivathshan KS, Saurav Kumar, Shreekanth R, Midhilesh E, Parvej Reja Saleh Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 00:00:00 +0000 The Microservice-based Optical Character Recognition application for offline forms <p>The majority of Indian offices still use the old-school method of filling up paper-based forms and storing them. These forms need human intervention even after digitization. The use of<br>digitized data in the future is inevitable, as it is easier to interpret and store.
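The relative-position step that FormAssist uses for character extraction can be sketched as grouping detected character boxes into rows and sorting each row into reading order before recognition; the box format and the row tolerance below are assumptions for illustration:

```python
# Sketch of the relative-position step: given bounding boxes of
# detected characters (x, y of the top-left corner, plus the
# recognized character for this toy example), group them into form
# rows by vertical proximity and sort each row left-to-right, so
# characters reach the recognizer in reading order.

def order_characters(boxes, row_tolerance=10):
    """boxes: list of (x, y, char). Returns the recovered text rows."""
    rows = []
    for x, y, ch in sorted(boxes, key=lambda b: b[1]):  # top-to-bottom
        if rows and abs(rows[-1][0] - y) <= row_tolerance:
            rows[-1][1].append((x, ch))     # same row: collect cell
        else:
            rows.append((y, [(x, ch)]))     # start a new row
    return ["".join(ch for _, ch in sorted(cells)) for _, cells in rows]

boxes = [(30, 52, "M"), (10, 50, "A"), (20, 51, "R"),
         (10, 90, "J"), (20, 92, "I")]
print(order_characters(boxes))  # ['ARM', 'JI']
```

In the actual pipeline the `char` field would be produced by the CNN classifier rather than supplied with the box.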
In this paper, we focus on the extension and user-facing implementation of the previously in-house developed OCR model presented in the published paper 'FormAssist' [1]. We make the model user-friendly through a web application called OCR-WebApp using a microservices<br>architecture, converting it into an end-to-end (E2E) business solution. We explain how the input image of an offline filled form flows from the user to the controller through API<br>calls, passes through the model and is processed in a serverless architecture, and how the results are stored in cloud storage and displayed to the user, all within a few minutes. We also discuss the challenges encountered while deploying the model into a production-level architecture and how we overcame them.</p> Jayeesha Ghosh, Srivathshan KS, Parvej Saleh Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 10:31:34 +0000 The Micro-services architecture for hosting AI/ML Models based on IoT Intelligence <p>Microservices Architecture has introduced the idea of developing functionality as a collection of small services, each running in its own process. Historically, organizations<br>would look to monolithic application suites to overcome business problems. Functionality evaluation was based on an overall assessment of a small number of platforms against key<br>business requirements. With the introduction of microservices, this has become much more flexible. Over the last few years, the term Microservices Architecture has arisen to describe a<br>particular way of designing software applications as suites of independently deployable services. The idea behind microservice architecture is to build an application as many<br>independent services rather than one large codebase. Rather than accessing the majority of the data through large databases, communication is often handled with API calls between the<br>services, with each service having its own lightweight database.
This paper presents the use of Microservices Architecture in an IoT platform for delivering an<br>application service. The ThingSpeak platform provides the IoT application to analyze the data. Later, the data collected from the sensors can be visualized using<br>D3.js, a JavaScript library, for an Indoor Air Quality Monitoring System. Understanding the microservices that underpin IoT platforms helps decrease time-to-market and ease IoT<br>adoption for every kind of enterprise.</p> Unalisha Gohain, Parvej Saleh, Srivathshan KS Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 00:00:00 +0000 The Micro-services architecture-based Text Analysis Engine <p>As the popularity of social media and various other web applications increases, modern scalable data management has become essential to cope with huge volumes of data. In the domain of text analytics, data from various social media platforms needs to be collected, cleaned, processed and visualized to provide various insights. This paper presents an<br>architecture relying on the microservice approach for creating a data management backend for text analysis. The microservices-based data caching, data processing, data analysis, and data visualization methods can be applied to enhance the available data, providing efficient management services to the users. As of 2018, the micro-blogging site Twitter averaged 321 million monthly active users [1]. Twitter places a strong emphasis on real-time information. Information relating to geolocation and entities such as author ID, author name, source, and people's reactions to an event can be extracted, stored and further analyzed by various Big Data tools.
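Reading sensor data back out of ThingSpeak for visualization can be sketched against its public REST feed. The `feeds.json` endpoint and the `created_at`/`field1` keys follow the public ThingSpeak API; the channel id and the meaning of `field1` (e.g. a PM2.5 reading) are assumptions for illustration:

```python
import json

# Sketch of fetching sensor history from ThingSpeak's REST feed and
# turning it into (timestamp, value) pairs for a D3.js chart.

def feed_url(channel_id, results=10, api_key=None):
    """Build the feeds.json URL for a channel (api_key for private
    channels)."""
    url = (f"https://api.thingspeak.com/channels/{channel_id}"
           f"/feeds.json?results={results}")
    return url + (f"&api_key={api_key}" if api_key else "")

def parse_feed(payload):
    """Turn a feeds.json payload into (timestamp, value) pairs."""
    return [(f["created_at"], float(f["field1"]))
            for f in json.loads(payload)["feeds"] if f.get("field1")]

sample = '''{"feeds": [
  {"created_at": "2020-03-09T10:00:00Z", "field1": "42.5"},
  {"created_at": "2020-03-09T10:05:00Z", "field1": "44.0"}]}'''
print(feed_url(123456, results=2))
print(parse_feed(sample))
```

A visualization microservice would call `feed_url`, fetch the JSON over HTTP, and hand the parsed pairs to the front end.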
This work aims at applying machine learning while adopting a microservice approach for analyzing events from the Twitter platform in real time based on keywords and geolocation, and finally proposes a user-friendly visualization of the data. The tweets are stored and fetched using the Elasticsearch search engine. The indexing and standardization features of the Elasticsearch framework are used for large-scale text mining. The results obtained from the query-based search engine are finally visualized using the powerful d3.js library.</p> Swaswati Dutta, Parvej Saleh, Srivathshan KS Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 10:56:05 +0000 The PeopleReporter: Smart social media tool to detect breaking news and its credibility <p>With a total of 4,156,932,140 internet users as of 2017, reaching 54% of the total population and counting, the number of internet users has increased drastically. An increase in the total number of users means more user-generated content across several online platforms, which is predominantly real-time. This user-generated content is being leveraged by applications to derive insights into customer behavior, opinion mining and marketing, and to provide niche services like banking in real time. In recent years, we have also seen a rise in citizen journalism, with the public posting real-time events on social media channels. Social media has emerged as a supporting player for traditional media as well as a powerful standalone expression tool for the public, changing the reliance on traditional media for reports and news. Further, with the increase in smartphones and better coverage of data networks, an increasing share of the credible news sourced by mainstream media comes from social media. Beyond media agencies, real-time event identification can be used by security departments, disaster management teams, and others for quick action.
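The keyword-plus-geolocation retrieval used by the text analysis engine above can be expressed as a plain Elasticsearch query-DSL dictionary, combining a `match` clause with a `geo_distance` filter. The index field names (`text`, `coordinates`) and the radius are assumptions for illustration:

```python
# Sketch of the kind of query the engine sends to Elasticsearch:
# a keyword match combined with a geo_distance filter, written as
# the plain query-DSL dictionary.

def tweet_query(keyword, lat, lon, radius_km=50):
    return {
        "bool": {
            "must": [{"match": {"text": keyword}}],
            "filter": [{
                "geo_distance": {
                    "distance": f"{radius_km}km",
                    "coordinates": {"lat": lat, "lon": lon},
                }
            }],
        }
    }

q = tweet_query("flood", lat=26.14, lon=91.73)  # around Guwahati
print(q["bool"]["must"][0]["match"]["text"])    # flood
```

An Elasticsearch client would submit this dictionary as the `query` body of a search against the tweet index.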
The most prominent source of information is the micro-blogging site Twitter, which provides geolocation and other features such as time, author ID, author name, source, link, and people’s reactions, all of which can be easily extracted, stored and analyzed using Big Data tools. Entity extraction with Natural Language Processing (NLP) is used to identify the type of event before proceeding further. The fundamental goal of our work is to limit the spread of falsehood by halting the proliferation of fake news in the system, while taking the lead in collecting information on certain events ahead of local media platforms. For example, when an earthquake occurs, people make many posts related to the earthquake, which enables prompt detection of the occurrence. Our model delivers notifications of such events much faster than the announcements of other media sources. In this paper, we have utilized the information from social platforms in real time, based on keywords and geolocation, and visualized it with powerful BI tools. Continuous monitoring helps us analyze the events occurring in the respective geolocation and assess their credibility. The credibility of such an event is determined with the help of a credibility score developed from multiple factors, including temporal and spatial features of the reported content.</p> Srivathshan KS, Parvej Reja Saleh, Vishesh Kumar Jha, Joyfred Jesuraj A, Desu Sesha Sai Suhash Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 11:01:48 +0000 The Pharmacovigilance using AI <p>Adverse Drug Effects (ADEs) are harmful reactions to a prescribed medicine. Pharmacovigilance is the effort to detect, assess and prevent ADEs. ADE detection is one of the most essential objectives of the post-marketing stage.
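A credibility score of the kind PeopleReporter computes can be sketched as a weighted combination of normalised signals; the factor names and weights below are illustrative stand-ins for the temporal and spatial factors described in the abstract:

```python
# Sketch of a weighted credibility score: combine normalised signals
# (each in 0..1) with weights that sum to 1. Factor names and weights
# are illustrative assumptions.

WEIGHTS = {
    "author_history": 0.3,   # how reliable the author has been
    "corroboration": 0.4,    # independent reports of the same event
    "spatial_match": 0.2,    # geotag consistent with event location
    "temporal_burst": 0.1,   # posted close to the event time
}

def credibility(signals, weights=WEIGHTS):
    """Missing signals contribute 0, i.e. no supporting evidence."""
    return sum(weights[k] * signals.get(k, 0.0) for k in weights)

event = {"author_history": 0.9, "corroboration": 0.8,
         "spatial_match": 1.0, "temporal_burst": 0.5}
print(round(credibility(event), 2))  # 0.84
```

A threshold on this score would then separate credible reports from likely falsehoods before an alert is raised.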
The Food and Drug Administration (FDA) uses the Adverse Event Reporting System (AERS) to monitor reports of ADEs from pharmaceutical companies, doctors, hospitals, patients and pharmacies. The major drawbacks are incomplete information, over-reporting of already documented ADEs and under-reporting of new ADEs. This paper concentrates on collecting tweets that mention a given drug, identifying the associated Adverse Drug Effects using NLP, and classifying them. Newly detected ADEs are reported after checking each detected ADE against a database of already reported ADEs.</p> Rakibul Asheeque, Srivathshan KS, Parvej Saleh Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 11:06:11 +0000 The SecureElect - Blockchain based Electronic Voting System to Enable Online <p>In democratic systems, voting is a central piece, as it gives individuals in a community a way to voice their opinion. E-voting was introduced to address several concerns, such as declining voter numbers and the security and accessibility of current voting systems. However, it is not cost-effective and still requires full supervision by a central authority. Given the need for high security and verifiability, e-voting systems have been a buzzword in recent years. The blockchain is an emerging, decentralized, and distributed technology that promises to enhance different aspects of many industries. Blockchain, together with smart contracts, emerges as a good candidate for building more secure, less expensive, more transparent, and easier-to-use e-voting frameworks. This paper assesses the prerequisites of building electronic voting frameworks and identifies the legal and technological limitations of using blockchain as a service for realizing such systems.
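The tamper-evidence property that motivates a blockchain vote ledger can be sketched with a minimal hash chain: each block stores the hash of the previous block, so altering any recorded vote breaks every later link. A real deployment adds consensus, signatures and voter authentication, which are out of scope here:

```python
import hashlib
import json

# Minimal hash-chain sketch of a tamper-evident vote ledger.

def block_hash(block):
    return hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_vote(chain, vote):
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"vote": vote, "prev": prev})

def chain_valid(chain):
    """Every block must reference the hash of its predecessor."""
    return all(chain[i]["prev"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

ledger = []
for v in ["candidate-A", "candidate-B", "candidate-A"]:
    append_vote(ledger, v)
print(chain_valid(ledger))         # True
ledger[1]["vote"] = "candidate-C"  # tamper with a recorded vote
print(chain_valid(ledger))         # False: later hashes no longer match
```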
The paper begins by assessing some of the popular blockchain frameworks that offer blockchain as a service, then discusses different consensus algorithms, and finally chooses the right infrastructure and algorithm for the problem. We have built a web platform that is used to simulate the voting process with blockchain, and the results of the system for transparency, security and immunity to data tampering are presented. Additionally, we discuss the possibilities of the Aadhaar card as an authentication mechanism for e-voting frameworks.</p> Srivathshan KS, Elamathi S, Parvej Reja Saleh Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 11:22:58 +0000 The Smart Meter <p>The demand for energy is increasing as a result of growth in both population and industrial development. To improve energy efficiency, consumers need to be more aware of their<br>energy consumption. In recent years, utilities have started developing new electric energy meters known as smart meters. A smart meter is a digital energy meter that measures the consumption of electrical energy and provides additional information compared to the traditional energy meter. The main aim of the paper is to develop a remote energy<br>measurement system using the ESP8266 Wi-Fi module. Furthermore, all data, such as energy consumption (in kWh) in disaggregated form, is obtained using the Non-Intrusive Load Monitoring<br>Tool Kit (NILMTK). Energy disaggregation estimates appliance-by-appliance electricity consumption from a single meter that measures the whole home’s electricity demand. Mobile<br>and web-based visualizations were created for the data collected from the meter.</p> Roshan Kumar Gupta, Parvej Saleh, Srivathshan KS Copyright (c) 2020 Probyto Journal of AI Research Mon, 09 Mar 2020 11:28:06 +0000
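The measurement arithmetic behind the smart meter can be sketched as integrating instantaneous power samples into kWh, together with a naive on/off disaggregation that attributes load above a base level to a single large appliance. Real NILM, as in NILMTK, uses far richer appliance models; the sample values and the 1500 W threshold below are made up for illustration:

```python
# Sketch of the meter arithmetic: integrate watt samples into kWh,
# plus a naive two-state disaggregation (base load vs. one big
# appliance). Illustrative only; NILMTK's models are far richer.

def energy_kwh(samples, interval_s):
    """samples: watt readings taken every interval_s seconds."""
    return sum(samples) * interval_s / 3_600_000  # W*s -> kWh

def disaggregate(samples, base_w=200, appliance_threshold_w=1500):
    """Label each reading as base load or big-appliance load."""
    return [("appliance" if w - base_w >= appliance_threshold_w
             else "base", w) for w in samples]

readings = [210, 220, 1900, 1950, 230]      # watts, every 60 s
print(round(energy_kwh(readings, 60), 3))   # 0.075 kWh over 5 minutes
print(disaggregate(readings))
```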