This study addresses water contamination by analyzing data collected from the Red River, Rio Grande, and Trinity River in Texas City, USA, between March 2023 and March 2024. The dataset comprises seven water quality parameters used to assess compliance with U.S. government standards for safe and clean drinking water: conductivity, pH, turbidity, dissolved oxygen (DO), total and fecal coliform, chemical oxygen demand (COD), and nitrate. The proposed Efficient AI-based Water Quality Prediction and Classification (EAI-WQP) model predicts water pollution parameters, particularly those influenced by industrial activities. The model is built on Apache Spark, a big data processing framework, which allows data to be handled and analyzed in real time for pollution management. To improve prediction accuracy, the model's parameters are tuned with the Firefly Algorithm (FA). An Adaptive Neuro-Fuzzy Inference System (ANFIS) classifier, which combines fuzzy logic with neural networks, then classifies water quality into polluted and non-polluted categories. In comparative evaluations against established machine learning techniques such as GRU-ARIMA, SVM, and Random Forest, the EAI-WQP model achieves better accuracy, precision, recall, and F1 score. The objectives of the study are to analyze water pollution in Texas City using AI methods, develop the EAI-WQP model for forecasting water quality parameters, process big data in real time with Apache Spark, tune the model with FA, classify water quality with ANFIS, and demonstrate the model's advantage over traditional machine learning methods.
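To make the FA tuning step concrete, the following is a minimal sketch of a basic Firefly Algorithm minimizing a toy objective. It is an illustration under stated assumptions, not the EAI-WQP implementation: the abstract does not say which parameters are tuned or which FA variant is used, so the function names (`firefly_minimize`, `toy_error`), the settings (`alpha`, `beta0`, `gamma`, swarm size, iteration count), and the quadratic stand-in for a validation error are all hypothetical.

```python
import math
import random


def firefly_minimize(objective, bounds, n_fireflies=15, n_iters=60,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimize `objective` over box `bounds` with a basic Firefly Algorithm.

    All default settings are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    # Random initial positions inside the search box.
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]

    for _ in range(n_iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter (lower cost)
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    # Attractiveness decays with squared distance.
                    beta = beta0 * math.exp(-gamma * r2)
                    for d, (lo, hi) in enumerate(bounds):
                        step = alpha * (rng.random() - 0.5) * (hi - lo)
                        moved = swarm[i][d] + beta * (swarm[j][d] - swarm[i][d]) + step
                        swarm[i][d] = min(max(moved, lo), hi)  # clamp to bounds
                    cost[i] = objective(swarm[i])
        alpha *= 0.97  # shrink the random walk as the search progresses

    best = min(range(n_fireflies), key=cost.__getitem__)
    return swarm[best], cost[best]


def toy_error(params):
    """Hypothetical stand-in for a validation error over two normalized hyperparameters."""
    x, y = params
    return (x - 0.3) ** 2 + (y - 0.7) ** 2


best_params, best_error = firefly_minimize(toy_error, [(0.0, 1.0), (0.0, 1.0)])
print(best_params, best_error)
```

In a real tuning loop, `toy_error` would be replaced by a function that trains the predictor with the candidate parameters and returns its validation error. In this variant the brightest firefly never moves, so the best cost found cannot get worse from one iteration to the next.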