Agricultural production systems have been evolving into complex business systems that require the accumulation and integration of knowledge and information from many diverse sources. To remain competitive, the modern farmer often relies on agricultural specialists and advisors for information to support decision making. Unfortunately, the assistance of an agricultural expert is not always available when the farmer needs it. To alleviate this problem, expert systems have been identified as a powerful tool with extensive potential in agriculture.
Agriculture is a major contributor to the Indian economy and is considered the main and foremost occupation practiced in India. A common problem among Indian farmers is that they do not choose the right crop for their soil. As a result, they face a serious setback in productivity. Precision agriculture addresses this problem. It is a modern farming technique that uses research data on soil characteristics, soil types, and crop yields to suggest the right crop for a farmer’s site-specific parameters. This reduces wrong crop choices and increases productivity. In this project, we are building an intelligent system that assists Indian farmers in making an informed decision about which crop to grow depending on the sowing season, the farm’s geographical location, and its soil characteristics. The system will also provide the farmer with a yield prediction for the recommended crop. The portal will also show more information about the recommended crop and the required pesticides.
Given the importance of ICT-enabled interventions in agriculture and of providing timely expert advice to farmers, the expert system on agriculture and animal husbandry was proposed and obtained as a network project from the Indian Council of Agricultural Research.
Recommender systems have become very popular in recent years and are used in various web applications, such as movie and book recommendation. Recommender Systems (RSs) are software tools that provide suggestions to users according to their requirements. The suggestions relate to various decision-making processes, such as which items to buy or what music to listen to. “Item” is the general term used to denote what the system recommends to users. An RS normally focuses on a specific type of item; its design, its graphical user interface, and the core recommendation technique used to generate the recommendations are all customized to provide useful and effective suggestions for that type of item. Broadly speaking, an RS suggests to a user those items that might be of interest to that user.
A farmer’s decision about which crop to grow is often clouded by intuition and by factors such as the desire for instant profit, lack of awareness of market demand, and overestimation of a soil’s potential to support a particular crop. A misguided decision on the part of the farmer could place a significant strain on the family’s financial condition. This could be one of the many reasons contributing to the farmer suicides frequently reported in the media. In a country like India, where agriculture and related sectors contribute approximately 20.4 per cent of Gross Value Added (GVA), such an erroneous judgment has negative implications not just for the farmer’s family but for the entire economy of a region. For this reason, we consider a farmer’s dilemma about which crop to grow during a particular season a very grave one. The need of the hour is a system that provides predictive insights to Indian farmers and shows the information required to cultivate the recommended crop, thereby helping them make an informed decision about which crop to grow. With this in mind, we propose an intelligent system that considers environmental parameters (temperature, rainfall, and geographical location in terms of state) and soil characteristics (pH value, soil type, and nutrient concentration) before recommending the most suitable crop to the user. The system also recommends pesticides to the farmer by analyzing an image of the pest.
Problem Definition:
The failure of farmers to choose the best-suited crop for their land when relying on traditional, non-scientific methods is a serious issue for a country where approximately 50 percent of the population is involved in farming. The limited availability and accessibility of correct, up-to-date information also hinders potential researchers from working on developing-country case studies. With the resources within our reach, we propose a system that addresses this problem by providing predictive insights on crop sustainability and recommendations based on machine learning models trained on essential environmental and economic parameters.
Machine learning is used across many spheres around the world, and the agriculture industry is no exception. Machine learning can play an essential role in crop recommendation and much more. Such recommendations, if made well in advance, can give farmers much-needed help, increase their profit, and help them select the right pesticide for a given pest.
Proposed System:
To eliminate the aforementioned drawbacks, we propose an Intelligent Crop Recommendation system, which takes into consideration all the appropriate parameters, including temperature, rainfall, location, and soil condition, to predict crop suitability. This system is fundamentally concerned with performing the primary function of Agro Consultant, which is providing crop recommendations to farmers using machine learning algorithms. We also provide a profit analysis of crops grown in different states, which gives the user an easy and reliable insight for deciding on and planning crops. We also provide pesticide recommendations by analysing an image of the pest.

Fig 1 represents the system architecture of the intelligent crop recommendation portal using ML and AI. First, we load the dataset and perform data preprocessing and data cleaning, then store the cleaned dataset. The cleaned dataset is split into training and testing sets. These sets are passed to the machine learning model, which then shows the crop recommendation. Likewise, the training and testing sets of images are sent to the neural network model, which shows the pesticide recommendation.
Proposed Experimental Work:
The main objectives of the proposed project are:
· To build an intelligent crop recommendation and pesticide recommendation system for farmers
· To improve the accuracy of the system using machine learning algorithms
· To provide accurate results for crops and pesticides
List of Modules:
The prediction system is divided into four modules:
Module 1: Data Pre-Processing
A. Load Dataset:
A data set is a collection of data. In the case of tabular data, a data set corresponds to one or more database tables, where every column of a table represents a particular variable, and each row corresponds to a given record of the data set in question.
B. Data Pre-Processing:
Data processing is, generally, “the collection and manipulation of items of data to produce meaningful information.” Data preprocessing is a data mining technique used to transform raw data into a useful and efficient format.
C. Cleaning Dataset:
Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset.
Module 2: Training & Testing
Train/Test is a method to measure the accuracy of a model. It is called Train/Test because the dataset is split into two sets: a training set and a testing set, with 85% of the data used for training and 15% for testing. The model is trained on the training set and evaluated on the testing set.
Module 3: Machine Learning
A. Machine Learning Model:
Analyze the dataset and process it according to machine learning algorithms such as Support Vector Classifier, Support Vector Machine, Random Forest, Gaussian Naive Bayes, and k-nearest neighbors. We use a voting classifier to get the best results by combining all these models.
B. Recommendation:
Using the voting classifier, we recommend which crop to cultivate. The output of the recommendation system is the name of the crop that should be cultivated, given as text.
C. Result:
The result is displayed as the name and image of the crop.
Module 4: Neural Network
A. Neural Network Model:
Analyze the dataset and process it according to a neural network algorithm such as a CNN.
B. Recommendation:
Using the CNN, we recommend which pesticide to use against the pest. The output of the recommendation system is the list of pesticides to be used.
C. Result:
The result is displayed as the name and image of the pesticide.
Machine Learning Algorithms:
A. Support Vector Classifier:
The objective of a Linear SVC (Support Vector Classifier) is to fit to the data you provide, returning a “best fit” hyperplane that divides, or categorizes, your data. From there, after getting the hyperplane, you can then feed some features to your classifier to see what the “predicted” class is.
B. Support Vector Machine:
Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression. The objective of the SVM algorithm is to find a hyperplane in an N-dimensional space that distinctly classifies the data points. The dimension of the hyperplane depends on the number of features. If the number of input features is two, the hyperplane is just a line. If the number of input features is three, the hyperplane becomes a 2-D plane.
C. Random Forest:
Random forest is a flexible, easy-to-use machine learning algorithm that produces a good result most of the time, even without hyperparameter tuning. It is also one of the most widely used algorithms because of its simplicity and versatility (it can be used for both classification and regression tasks). Random forest is a supervised learning algorithm. The “forest” it builds is an ensemble of decision trees, usually trained with the “bagging” method. The general idea of the bagging method is that a combination of learning models improves the overall result. In short, random forest builds multiple decision trees and merges them to get a more accurate and stable prediction.
D. Gaussian Naive Bayes:
Gaussian Naive Bayes (GNB) is a probabilistic classifier based on Bayes’ theorem. It assumes that the features are conditionally independent given the class and that each continuous feature follows a Gaussian (normal) distribution within each class. The classifier estimates the mean and variance of every feature for each class from the training data and assigns a new sample to the class with the highest posterior probability. The output of the classifier is a class label, which in this system is the crop name.
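As a brief worked formula, the Gaussian assumption can be written as follows. The symbols are introduced here for illustration and are not taken from the original text: $x_i$ is the value of feature $i$, $c$ is a class (crop), and $\mu_{c,i}$ and $\sigma_{c,i}^2$ are the mean and variance of feature $i$ estimated from the training samples of class $c$.

```latex
P(x_i \mid y = c) = \frac{1}{\sqrt{2\pi\sigma_{c,i}^{2}}}
  \exp\!\left(-\frac{(x_i - \mu_{c,i})^{2}}{2\sigma_{c,i}^{2}}\right),
\qquad
\hat{y} = \arg\max_{c}\; P(y = c)\prod_{i=1}^{n} P(x_i \mid y = c)
```

The predicted label $\hat{y}$ is the class whose prior multiplied by the per-feature likelihoods is largest.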
E. k-nearest neighbors:
K-Nearest Neighbor (K-NN) is one of the simplest machine learning algorithms based on the supervised learning technique. The K-NN algorithm compares a new case with the available cases and puts the new case into the category it is most similar to. K-NN stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category using the K-NN algorithm. K-NN can be used for regression as well as classification, but it is mostly used for classification problems.
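The similarity-based classification described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's implementation: the function name `knn_predict`, the 2-D points, and the labels "A" and "B" are all invented for the example, and Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority label among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Invented toy data: two well-separated groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["A", "A", "A", "B", "B", "B"]

near_a = knn_predict(points, labels, (2, 2))
near_b = knn_predict(points, labels, (7, 7))
print(near_a, near_b)
```

Because K-NN stores the whole training set, prediction cost grows with the number of stored samples, which is worth keeping in mind for larger datasets.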
Voting Classifier:
A Voting Classifier is a machine learning model that trains an ensemble of several models and predicts an output class based on the class most often chosen, or given the highest probability, by those models. It aggregates the findings of each classifier passed into the Voting Classifier and predicts the output class based on the majority of votes. The idea is that, instead of creating separate dedicated models and finding the accuracy of each of them, we create a single model that is trained with these models and predicts the output from their combined majority vote for each output class.
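The majority-vote rule itself is simple enough to show directly. The sketch below assumes hard voting (each model casts one vote for a class); the helper name `majority_vote` and the example votes are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label chosen by the most classifiers.

    Ties are resolved in favour of the label encountered first.
    """
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical votes from five individual classifiers for one sample.
votes = ["rice", "maize", "rice", "rice", "maize"]
winner = majority_vote(votes)
print(winner)
```

Soft voting, by contrast, would average the class probabilities from each model and pick the class with the highest mean probability.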
Neural Network Algorithm:
Convolutional Neural Network (CNN):
A convolutional neural network, or CNN, is a deep learning neural network designed for processing structured arrays of data such as images. CNNs are very good at picking up on patterns in the input image, such as lines, gradients, circles, or even eyes and faces. This characteristic is what makes convolutional neural networks so robust for computer vision. The strength of a CNN comes from a particular kind of layer called the convolutional layer. A CNN contains many convolutional layers stacked on top of each other, each one capable of recognizing more sophisticated shapes. With three or four convolutional layers it is possible to recognize handwritten digits, and with 25 layers it is possible to distinguish human faces.
Fig 2 shows the level 0 DFD. The crop dataset is passed through the recommendation system, which displays the result. Similarly, the pest image dataset is passed through the recommendation system, which displays the result.
Fig 3 shows the level 1 DFD. The crop dataset is passed through the recommendation system, which recommends a crop according to the soil attributes and shows the result. Similarly, the pest image is passed through the image recognition system, which identifies the pest and shows the pesticides that should be used.
Level 0:

Level 1:

Fig 4 shows the use case diagram. The user has access to the soil and environmental characteristics, the pest image upload, and the displayed result. The Python model has access to the environmental characteristics, the crop database, the pest images folder, the pest image upload, and the displayed result.

Fig 5 shows the activity diagram. First, the system loads the dataset and performs the preprocessing. The dataset is then split into training and testing sets, which are passed through the machine learning model to obtain the recommended crop. Similarly, the pest images are divided into training and testing sets and passed through the neural network model, which recommends the pesticide.


Python:
Python offers concise and readable code. While complex algorithms and versatile workflows stand behind machine learning and AI, Python’s simplicity allows developers to write reliable systems. Developers get to put all their effort into solving an ML problem instead of focusing on the technical nuances of the language.
Additionally, Python is appealing to many developers as it’s easy to learn. Python code is understandable by humans, which makes it easier to build models for machine learning.
Many programmers say that Python is more intuitive than other programming languages. Others point out the many frameworks, libraries, and extensions that simplify the implementation of different functionalities. It’s generally accepted that Python is suitable for collaborative implementation when multiple developers are involved. Since Python is a general-purpose language, it can do a set of complex machine learning tasks and enable you to build prototypes quickly that allow you to test your product for machine learning purposes.
To reduce development time, programmers turn to a number of Python frameworks and libraries. A software library is pre-written code that developers use to solve common programming tasks.
Jupyter notebook:
The Jupyter Notebook is a living online notebook, letting faculty and students weave together computational information (code, data, statistics) with narrative, multimedia, and graphs. Faculty can use it to set up interactive textbooks, full of explanations and examples which students can test out right from their browsers. Students can use it to explain their reasoning, show their work, and draw connections between their classwork and the world outside. Scientists, journalists, and researchers can use it to open up their data, share the stories behind their computations, and enable future collaboration and innovation.
The notebook lets you write different types of text. Here, you can see formatted explanatory text, a gray block of code, and a visualization. It kind of looks like a textbook, except that this notebook can be accessed by students on their computers, and all of the code is live–students can run through each part of the computation to see the result.
We have completed Module I (data pre-processing), Module II (training and testing), Module III (machine learning algorithms and crop recommendation), Module IV (neural network and pesticide recommendation), and Module V (web interface).
Module 1: Data Pre-processing:
A. Load Dataset:
Fig 7 shows the loading of the dataset. A dataset is a collection of data. In the case of tabular data, a dataset corresponds to one or more database tables, where every column of a table represents a particular variable and each row corresponds to a given record of the dataset in question.
Pandas is the most popular data manipulation package in Python, and DataFrames are the Pandas data type for storing tabular data.
Loading data from a CSV file into a Pandas DataFrame is done with the read_csv function in Pandas.
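As a minimal sketch of this loading step, the snippet below reads a small CSV into a DataFrame. The column names are assumptions, not the project's actual schema. The first data row reuses the soil and weather values reported later in this document for rice; the other rows are invented purely for illustration. An in-memory string stands in for the real file so the example is self-contained.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the crop dataset file.
# Column names are assumed; rows 2 and 3 are made-up illustrative values.
csv_text = """N,P,K,temperature,humidity,ph,rainfall,label
91,35,39,23.7,80,6.9,206,rice
20,60,20,19.0,20,5.8,100,maize
100,20,30,25.0,85,6.5,200,rice
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(df.head())
```

With a real file, the call would take the file path instead of the `StringIO` object.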

B. Data Pre-Processing:
Fig 8 shows data preprocessing. Data processing is, generally, “the collection and manipulation of items of data to produce meaningful information.” Data preprocessing is a data mining technique used to transform raw data into a useful and efficient format.

C. Cleaning Dataset:
Fig 9 shows data cleaning. Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset. The call dataframe.isnull().sum() returns the number of missing values in each column of the dataset. MD Tools is a Python module that provides a set of classes useful for the analysis and modification of protein structure.
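The cleaning step can be sketched with standard Pandas calls. The DataFrame below is invented for illustration (one missing value and one duplicated row), and dropping incomplete rows is only one possible policy; imputing missing values would be an equally valid choice depending on the data.

```python
import numpy as np
import pandas as pd

# Invented sample with one missing pH value and one exact duplicate row.
df = pd.DataFrame({
    "N": [91, 91, 20, 100],
    "ph": [6.9, 6.9, np.nan, 6.5],
    "label": ["rice", "rice", "maize", "rice"],
})

# Count of missing values per column (a Series indexed by column name).
missing_per_column = df.isnull().sum()

# Remove exact duplicate rows, then rows that still contain missing values.
cleaned = df.drop_duplicates().dropna().reset_index(drop=True)

print(missing_per_column)
print(cleaned)
```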

D. Analysis of Dataset:
Data analysis is shown in the corresponding figure. Data analysis refers to the process of manipulating raw data to uncover useful insights and draw conclusions. During this process, a data analyst or data scientist organizes, transforms, and models a dataset. Organizations use data to solve business problems, make informed decisions, and plan effectively for the future.
Module 2: Training Data and Testing Data
Train/Test is a method to measure the accuracy of your model.
Fig 10 shows the training data and testing data. It is called Train/Test because the dataset is split into two sets: a training set and a testing set, with 85% of the data used for training and 15% for testing. The model is trained on the training set and tested on the testing set.
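An 85/15 split like the one described can be produced with scikit-learn's `train_test_split`. The data here is synthetic, and the `random_state` and `stratify` settings are assumptions added for reproducibility and balanced classes; the original text does not say which options were used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 samples, 7 features, 3 made-up class ids.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 7))
y = np.arange(100) % 3

# test_size=0.15 keeps 85% of the rows for training and 15% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```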

Test Dataset:
Testing the model means measuring its accuracy on the testing set. Scikit-learn is probably the most useful library for machine learning in Python. The sklearn library contains many efficient tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction.
A machine learning pipeline is a way to codify and automate the workflow it takes to produce a machine learning model. Machine learning pipelines consist of multiple sequential steps that do everything from data extraction and preprocessing to model training and deployment. A machine learning pipeline is used to help automate machine learning workflows. They operate by enabling a sequence of data to be transformed and correlated together in a model that can be tested and evaluated to achieve an outcome, whether positive or negative.
Effective use of the model requires appropriate preparation of the input data and hyperparameter tuning of the model. Collectively, the linear sequence of steps required to prepare the data, tune the model, and transform the predictions is called the modeling pipeline.
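A modeling pipeline of this kind can be sketched with scikit-learn's `Pipeline`. The two steps chosen here (feature scaling followed by a support vector classifier) and the synthetic data are assumptions for illustration; the original text does not specify which steps the project's pipeline contains.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 3-class data with 7 features, standing in for the crop dataset.
X, y = make_classification(
    n_samples=200, n_features=7, n_informative=5, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)

# Each step is applied in order: scale the features, then fit the classifier.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", SVC()),
])
pipe.fit(X_train, y_train)

accuracy = pipe.score(X_test, y_test)
predictions = pipe.predict(X_test)
print(f"test accuracy: {accuracy:.2f}")
```

Wrapping the scaler in the pipeline means it is fitted only on the training data, which avoids leaking information from the testing set.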
Module 3: Machine Learning Algorithms
Voting Classifier:
Fig 11 shows the machine learning model using a voting classifier. A Voting Classifier is a machine learning model that trains an ensemble of several models and predicts an output class based on the class most often chosen, or given the highest probability, by those models. It aggregates the findings of each classifier passed into the Voting Classifier and predicts the output class based on the majority of votes.
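A voting ensemble over the five algorithms named in this document could look roughly like the sketch below, using scikit-learn's `VotingClassifier`. Hard (majority) voting, the default hyperparameters, and the synthetic data are all assumptions; the figure referenced above is the authoritative description of the project's actual model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC, LinearSVC

# Synthetic 3-class data with 7 features, standing in for the crop dataset.
X, y = make_classification(
    n_samples=300, n_features=7, n_informative=5, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)

# Hard voting: every fitted model casts one vote and the majority class wins.
ensemble = VotingClassifier(
    estimators=[
        ("linear_svc", LinearSVC(max_iter=10000)),
        ("svm", SVC()),
        ("random_forest", RandomForestClassifier(random_state=0)),
        ("gaussian_nb", GaussianNB()),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)

predictions = ensemble.predict(X_test)
accuracy = ensemble.score(X_test, y_test)
print(f"ensemble test accuracy: {accuracy:.2f}")
```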

Crop Recommendation:
Recommendation is the process of analysing historical data, finding patterns in it, and making suggestions based on those patterns.
Fig 12 shows a single prediction. Pickle is the standard way of serializing objects in Python. You can use pickle to serialize a trained machine learning model and save the serialized form to a file. Later, you can load this file to deserialize the model and use it to make new predictions.
Use the pickle.dump() function to store the object in a file. pickle.dump() takes three arguments. The first is the object to store. The second is the file object obtained by opening the desired file in write-binary (wb) mode. The third is an optional keyword argument that defines the protocol.
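The save-and-reload cycle can be sketched with the standard library alone. A plain dictionary stands in for the trained model here, since any picklable Python object is handled the same way; the file name `model.pkl` and the dictionary contents are hypothetical.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model; a fitted scikit-learn estimator would be
# passed to pickle.dump() in exactly the same way.
model = {"name": "voting_classifier", "classes": ["rice", "maize"]}

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Serialize: object first, then the file opened in write-binary mode,
# then the optional protocol keyword.
with open(path, "wb") as f:
    pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)

# Deserialize: open in read-binary mode and load the object back.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored)
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.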

Module 4: Neural Network Algorithm
Convolutional Neural Network(CNN):
Fig 13 and Fig 14 show the creation of the CNN model. A CNN is a deep learning neural network designed for processing structured arrays of data such as images. A convolutional neural network is a multi-layered feed-forward neural network, built by stacking many hidden layers on top of each other in a particular order. This sequential design allows a CNN to learn hierarchical features. The hidden layers are typically convolutional layers followed by activation layers, some of them followed by pooling layers.
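To make the convolution, activation, and pooling sequence concrete, the sketch below runs one such stage on a tiny image using only NumPy. It is a hand-written illustration of what a single CNN stage computes, not the project's network: the function names, the 6x6 image, and the fixed edge-detecting kernel are all invented for the example, whereas a real CNN learns its kernels during training.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation,
    which is the operation CNN libraries call convolution)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Activation layer: negative responses are set to zero."""
    return np.maximum(x, 0)

def max_pool2x2(x):
    """Pooling layer: keep the maximum of each non-overlapping 2x2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Invented 6x6 image with a vertical edge: left half dark, right half bright.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked 3x3 kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = relu(conv2d(image, kernel))   # shape (4, 4)
features = max_pool2x2(feature_map)         # shape (2, 2)
print(features)
```

Stacking several such stages lets later layers combine simple edge responses into more complex shapes, which is the hierarchical learning described above.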




Fig 16 above shows the result for the given soil characteristics. Fig 15 shows the values entered: Nitrogen, Phosphorus, Potassium, and pH are taken as 91, 35, 39, and 6.9 respectively. Rainfall, temperature, and relative humidity in the area are taken as 206 mm, 23.7 °C, and 80% respectively. For these land and environmental conditions, rice is the best crop.



Fig 17 above shows the result for the given pest image. Fig 18 shows the uploaded image; the uploaded image of a grasshopper is recognized as a grasshopper. The pesticides shown are those recommended for getting rid of the recognized pest.
This project implements crop and pesticide recommendation: the system recommends which crop should be grown and which pesticide should be used against a given pest. The crop recommendation is made using machine learning algorithms combined in a voting classifier, and the image analysis is done using a neural network algorithm, the CNN. The project has four modules: Data Pre-processing, Training & Testing, Machine Learning, and Neural Network. In module 1, Data Pre-processing, we load the dataset and remove unwanted attributes and duplicate values from it. In module 2, Training and Testing, we split the dataset into training and testing sets as x and y. In module 3, Machine Learning Algorithms and Crop Recommendation, we recommend the crop. In module 4, Neural Network and Pesticide Recommendation, we recognize the pest and recommend the pesticide. Hence, we conclude that we have completed module 1, module 2, module 3, and module 4 of our project.
Many additional features can be added to the system. Currently, it takes the necessary environmental factors as inputs and suggests a suitable crop to cultivate. As a next step, automation can be added as a response system to feedback. At present all environmental factors are entered as inputs, but as an additional feature an algorithm could be implemented to predict one factor from three others. In future, the accuracy of the model could be increased by using deep learning.
A further development is to integrate the crop recommendation system with another subsystem, a yield predictor, which would give the farmer an estimate of production for the recommended crop. We can also implement a new system that identifies the pest using deep learning and recommends both the pesticide and the amount that should be used against the pest.
The Internet of Things (IoT) is a popular technology these days. By using a Raspberry Pi, an Arduino, and N, P, K, pH, and humidity sensors, we can automate the manual entry of values. Integrating the Raspberry Pi, Python libraries, and sensors makes a portable device.