A comprehensive web application for processing, analyzing, and managing duplicate records in student data. Built with React, Node.js, Express, and MongoDB.
- Email/Password Authentication: Secure login with JWT tokens
- Google OAuth: Sign in with @paruluniversity.ac.in Google accounts
- Role-Based Access: User and admin roles with proper authorization
- Secure Storage: Passwords hashed with bcrypt, tokens stored securely
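
The domain restriction on sign-in can be illustrated with a small check. This is a minimal sketch, not the project's actual code: the helper name `isAllowedEmail` and the constant `ALLOWED_DOMAIN` are hypothetical, and only the domain itself comes from the feature list above.

```typescript
// Hypothetical helper: the feature list only states that sign-in is limited
// to @paruluniversity.ac.in accounts; names here are illustrative.
const ALLOWED_DOMAIN = "paruluniversity.ac.in";

function isAllowedEmail(email: string): boolean {
  const normalized = email.trim().toLowerCase();
  const at = normalized.lastIndexOf("@");
  // Require a non-empty local part and an exact domain match, so that
  // look-alikes such as "user@paruluniversity.ac.in.evil.com" are rejected.
  return at > 0 && normalized.slice(at + 1) === ALLOWED_DOMAIN;
}
```

A check like this is only meaningful on the server. For Google sign-in, the ID token also carries an `hd` (hosted domain) claim that can be compared against the same domain after the token has been verified.
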
- CSV File Upload: Upload and process student data files
- Duplicate Detection: Intelligent algorithm to identify potential duplicates based on:
  - Aadhar number (exact match)
  - Application ID (exact match)
  - Phone number (exact match)
  - Name + DOB (fuzzy matching with Levenshtein distance)
  - Email (exact match)
- Automatic Grouping: Duplicates automatically grouped and flagged
- File History: Stores last 10 processed files per user
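
The matching rules listed above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the application's implementation: the `StudentRecord` shape, the function names, the edit-distance threshold of 2, and the use of union-find for grouping are all assumptions, since the feature list does not specify them.

```typescript
// ASSUMPTION: field names and record shape are illustrative only.
interface StudentRecord {
  id: string;
  name: string;
  dob: string; // e.g. "2004-05-17"
  aadhar?: string;
  applicationId?: string;
  phone?: string;
  email?: string;
}

// Dynamic-programming Levenshtein distance (insert, delete, substitute).
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

const norm = (s?: string): string => (s ?? "").trim().toLowerCase();

// ASSUMPTION: names count as similar when the edit distance is at most 2.
const NAME_THRESHOLD = 2;

function isPotentialDuplicate(a: StudentRecord, b: StudentRecord): boolean {
  const exactKeys: (keyof StudentRecord)[] = ["aadhar", "applicationId", "phone", "email"];
  for (const key of exactKeys) {
    const va = norm(a[key]);
    // Empty values never match, so two records missing a phone are not duplicates.
    if (va !== "" && va === norm(b[key])) return true;
  }
  // Fuzzy rule: same date of birth and a near-identical name.
  return a.dob === b.dob && levenshtein(norm(a.name), norm(b.name)) <= NAME_THRESHOLD;
}

// Groups record ids transitively: if A matches B and B matches C, all three share a group.
function groupDuplicates(records: StudentRecord[]): string[][] {
  const parent = records.map((_, i) => i);
  const find = (i: number): number => (parent[i] === i ? i : (parent[i] = find(parent[i])));
  for (let i = 0; i < records.length; i++) {
    for (let j = i + 1; j < records.length; j++) {
      if (isPotentialDuplicate(records[i], records[j])) parent[find(j)] = find(i);
    }
  }
  const groups = new Map<number, string[]>();
  records.forEach((r, i) => {
    const root = find(i);
    groups.set(root, [...(groups.get(root) ?? []), r.id]);
  });
  // Only groups with more than one record are reported as duplicates.
  return Array.from(groups.values()).filter((g) => g.length > 1);
}
```

The pairwise comparison is quadratic in the number of records, which is fine for a sketch; a larger file would likely call for indexing the exact-match fields first and running the fuzzy comparison only within records that share a date of birth.
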
- Home Page: Overview of all processing activity
  - Total files processed
  - Total records analyzed
  - Clean records count
  - Duplicates found
  - Year-wise and month-wise performance graphs
- Data Insights: Detailed metrics for the most recent file
  - File health score
  - Duplicate risk assessment
  - Record breakdown
- Real-time Updates: All metrics fetched from MongoDB
- Interactive Table: View and edit all record fields
- Inline Editing: Edit name, DOB, phone, email, Aadhar, and notes
- Auto-save: Changes automatically saved after 2 seconds of inactivity
- Manual Save: Save button as a fallback to auto-save
- File Selector: Dropdown to select from last 10 processed files
- Duplicate Flagging: Toggle potential duplicate status
- Re-grouping: Re-run duplicate detection on processed files
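
The auto-save behaviour described above (save after 2 seconds of inactivity) is a debounce. A minimal sketch, assuming a hypothetical `createAutoSaver` helper and a caller-supplied `save` callback; the injectable `timers` argument is an assumption added here so the logic can be exercised without real delays.

```typescript
type TimerApi = {
  setTimeout: (fn: () => void, ms: number) => unknown;
  clearTimeout: (handle: any) => void;
};

// Returns a function to call on every edit. `save` runs once edits have
// stopped for `delayMs` milliseconds (2000 ms matches the 2-second window).
function createAutoSaver<T>(
  save: (value: T) => void,
  delayMs = 2000,
  timers: TimerApi = { setTimeout, clearTimeout },
): (value: T) => void {
  let handle: unknown;
  return (value: T) => {
    // Each new edit cancels the pending save and restarts the countdown.
    if (handle !== undefined) timers.clearTimeout(handle);
    handle = timers.setTimeout(() => {
      handle = undefined;
      save(value); // only the most recent value is persisted
    }, delayMs);
  };
}
```

In a component this would be called from the field's change handler, for example `const onEdit = createAutoSaver(saveRecord)`, where `saveRecord` is whatever function sends the update to the backend. The manual save button can call the same `save` function directly.
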
- PDF Export: Generate professional PDF reports with all record data
- Excel Export: Export to .xlsx format with proper formatting
- API Gateway: Send processed data to external systems via REST API
- Configurable Outbound: Set up custom API endpoints for data integration
- Helmet.js: Security headers protection
- Rate Limiting: Prevents brute force attacks (5 attempts per 15 min for auth)
- Input Validation: Comprehensive validation using express-validator
- CORS Configuration: Controlled cross-origin resource sharing
- JWT Expiry: Tokens expire after 30 days
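
The auth rate limit above (5 attempts per 15 minutes) can be pictured with a small in-memory limiter. The list does not say which library or strategy enforces the limit, so this is a standalone fixed-window sketch: the class name `AttemptLimiter`, the per-key bookkeeping, and the injectable clock are all assumptions.

```typescript
// Fixed-window limiter: at most `maxAttempts` per `windowMs` for each key.
class AttemptLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly maxAttempts: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(maxAttempts = 5, windowMs = 15 * 60 * 1000, now: () => number = Date.now) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Records one attempt for `key` (e.g. a client IP) and reports whether it is allowed.
  tryAttempt(key: string): boolean {
    const t = this.now();
    const w = this.windows.get(key);
    if (!w || t - w.start >= this.windowMs) {
      // First attempt, or the previous window has expired: start a new window.
      this.windows.set(key, { start: t, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= this.maxAttempts;
  }
}
```

An in-memory map like this resets on restart and is not shared between server instances, so a multi-instance deployment would need a shared store instead.
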
- Node.js v16 or higher
- MongoDB Atlas account (free tier) or local MongoDB
- Google Cloud Platform account (for OAuth)
- npm or yarn package manager
Detailed setup instructions are available in setup.md.
- Clone the repository

      git clone <your-repo-url>
      cd duplicate-finder-dashboard

- Install dependencies

      # Backend
      cd backend
      npm install

      # Frontend
      cd ..
      npm install

- Configure environment variables
  - Copy `.env.example` to `.env` in both root and backend folders
  - Update MongoDB URI, JWT secret, and Google OAuth credentials
  - See setup.md for detailed instructions

- Run the application

      # Terminal 1 - Backend
      cd backend
      npm start

      # Terminal 2 - Frontend
      npm run dev

- Access the application
  - Frontend: http://localhost:5173
  - Backend API: http://localhost:5000

- Setup Guide: Complete setup instructions including MongoDB Atlas and Google OAuth
- Security Guide: Security best practices and configurations
- API Documentation: Backend API endpoints and usage

    duplicate-finder-dashboard/
    ├── backend/              # Backend API
    │   ├── config/           # Database configuration
    │   ├── controllers/      # Request handlers
    │   ├── middleware/       # Auth, validation, etc.
    │   ├── models/           # Mongoose models
    │   ├── routes/           # API routes
    │   ├── scripts/          # Utility scripts
    │   └── utils/            # Helper functions
    ├── src/                  # Frontend React app
    │   ├── components/       # Reusable components
    │   ├── pages/            # Page components
    │   ├── config/           # Configuration
    │   └── utils/            # Utility functions
    ├── public/               # Static assets
    ├── setup.md              # Setup instructions
    ├── security.md           # Security documentation
    └── README.md             # This file

This application implements multiple layers of security:
- Password hashing with bcrypt
- JWT token-based authentication
- Rate limiting on authentication endpoints
- Input validation and sanitization
- Security headers with Helmet.js
- CORS configuration
- MongoDB injection prevention
For detailed security information, see security.md.
- Authentication Flow
  - Register with @paruluniversity.ac.in email
  - Login with credentials
  - Login with Google OAuth
- File Processing
  - Upload CSV file with student data
  - Verify duplicate detection
  - Check data on Home and Data Insights pages
- Mark Duplication
  - Edit records inline
  - Test auto-save functionality
  - Re-run duplicate grouping
  - Toggle potential duplicate flags
- Export Functionality
  - Export to PDF
  - Export to Excel
  - Verify downloaded files

Examples available in setup.md.
- Update environment variables for production
- Use MongoDB Atlas connection string
- Set secure JWT secret
- Configure production CORS origins
- Enable HTTPS
- Build the production bundle: `npm run build`
- Deploy the `dist` folder to your hosting service
- Update `VITE_API_URL` to point to your production backend
- React 18
- TypeScript
- Tailwind CSS
- React Router
- Shadcn UI Components
- Recharts (for graphs)
- Node.js
- Express.js
- MongoDB with Mongoose
- JWT for authentication
- bcrypt for password hashing
- Express Validator
- Helmet.js
- Rate Limiting
- Vite
- ESLint
- Git
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License.
For issues, questions, or contributions, please open an issue on GitHub.
- Initial release with complete authentication system
- CSV file processing with duplicate detection
- Interactive Mark Duplication screen
- PDF and Excel export functionality
- Dashboard with analytics and graphs
- API gateway for external integrations
- Comprehensive security features
- Bulk edit functionality
- Advanced search and filtering
- Email notifications
- Scheduled file processing
- Advanced duplicate detection algorithms
- Mobile-responsive design improvements
- Dark mode support