jaluiovilash/duplicatefinder-mern

Duplicate Finder Dashboard

A comprehensive web application for processing, analyzing, and managing duplicate records in student data. Built with React, Node.js, Express, and MongoDB.

🚀 Features

Authentication & Security

  • Email/Password Authentication: Secure login with JWT tokens
  • Google OAuth: Sign in with @paruluniversity.ac.in Google accounts
  • Role-Based Access: User and admin roles with proper authorization
  • Secure Storage: Passwords hashed with bcrypt, tokens stored securely
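The domain restriction mentioned above can be illustrated with a small check. This is a minimal sketch, not the repository's code: the function name `isAllowedEmail` and the constant `ALLOWED_DOMAIN` are hypothetical, and a real Google OAuth flow would normally also verify the ID token itself (for example its `hd` and `email_verified` claims) rather than trust a bare string.

```javascript
// Hypothetical helper: accept only addresses on the university domain.
const ALLOWED_DOMAIN = "paruluniversity.ac.in";

function isAllowedEmail(email) {
  if (typeof email !== "string") return false;
  // Split on "@" and require exactly one local part and one domain part.
  const parts = email.trim().toLowerCase().split("@");
  return parts.length === 2 && parts[0] !== "" && parts[1] === ALLOWED_DOMAIN;
}

console.log(isAllowedEmail("student@paruluniversity.ac.in"));
console.log(isAllowedEmail("student@paruluniversity.ac.in.example.com"));
```

Comparing the whole domain part, rather than using a suffix or substring test, avoids accepting look-alike domains that merely contain the university domain.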

Data Processing

  • CSV File Upload: Upload and process student data files
  • Duplicate Detection: Flags potential duplicates based on:
    • Aadhar number (exact match)
    • Application ID (exact match)
    • Phone number (exact match)
    • Name + DOB (fuzzy matching with Levenshtein distance)
    • Email (exact match)
  • Automatic Grouping: Duplicates automatically grouped and flagged
  • File History: Stores last 10 processed files per user
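The matching rules listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual algorithm: the field names (`aadhar`, `applicationId`, `phone`, `email`, `name`, `dob`), the name-distance threshold of 2, and the union-find grouping are all hypothetical choices made for the example.

```javascript
// Classic dynamic-programming Levenshtein edit distance between two strings.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

const normalize = (value) => String(value ?? "").trim().toLowerCase();

// Assumed field names for the exact-match rules.
const EXACT_FIELDS = ["aadhar", "applicationId", "phone", "email"];

function isPotentialDuplicate(a, b, maxNameDistance = 2) {
  // Any non-empty exact-match field that agrees marks the pair.
  for (const field of EXACT_FIELDS) {
    const value = normalize(a[field]);
    if (value !== "" && value === normalize(b[field])) return true;
  }
  // Otherwise require the same DOB plus a near-identical, non-empty name.
  const dobA = normalize(a.dob);
  const nameA = normalize(a.name);
  const nameB = normalize(b.name);
  if (dobA === "" || dobA !== normalize(b.dob)) return false;
  if (nameA === "" || nameB === "") return false;
  return levenshtein(nameA, nameB) <= maxNameDistance;
}

// Groups records transitively with a small union-find; O(n^2) comparisons.
function groupDuplicates(records) {
  const parent = records.map((_, i) => i);
  const find = (i) => (parent[i] === i ? i : (parent[i] = find(parent[i])));
  for (let i = 0; i < records.length; i++) {
    for (let j = i + 1; j < records.length; j++) {
      if (isPotentialDuplicate(records[i], records[j])) parent[find(j)] = find(i);
    }
  }
  const groups = new Map();
  records.forEach((record, i) => {
    const root = find(i);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(record);
  });
  // Only groups with more than one member are duplicates.
  return [...groups.values()].filter((group) => group.length > 1);
}

const sampleRecords = [
  { name: "Asha Patel", dob: "2001-05-14", phone: "9000000001" },
  { name: "Asha Patil", dob: "2001-05-14", phone: "9000000002" },
  { name: "Ravi Shah", dob: "1999-01-01", phone: "9000000003" },
];
console.log(groupDuplicates(sampleRecords));
```

Pairwise comparison is quadratic in the number of records, so a larger file would likely need blocking or indexing on the exact-match fields first.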

Dashboard & Analytics

  • Home Page: Overview of all processing activity
    • Total files processed
    • Total records analyzed
    • Clean records count
    • Duplicates found
    • Year-wise and month-wise performance graphs
  • Data Insights: Detailed metrics for the most recent file
    • File health score
    • Duplicate risk assessment
    • Record breakdown
  • Real-time Updates: All metrics fetched from MongoDB

Mark Duplication Screen

  • Interactive Table: View and edit all record fields
  • Inline Editing: Edit name, DOB, phone, email, Aadhar, and notes
  • Auto-save: Changes automatically saved after 2 seconds of inactivity
  • Manual Save: Fallback button to save changes on demand
  • File Selector: Dropdown to select from last 10 processed files
  • Duplicate Flagging: Toggle potential duplicate status
  • Re-grouping: Re-run duplicate detection on processed files
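The auto-save behavior described above (save after 2 seconds of inactivity, with a manual fallback) is essentially a debounce. The sketch below shows one way it could work. The names `createAutoSaver`, `notifyChange`, and `flush` are hypothetical, and the injectable `timers` parameter exists only so the logic can be exercised without real waiting.

```javascript
// Hypothetical debounced saver: `save` runs `delayMs` after the most recent edit.
function createAutoSaver(
  save,
  delayMs = 2000,
  timers = {
    setTimeout: (fn, ms) => setTimeout(fn, ms),
    clearTimeout: (handle) => clearTimeout(handle),
  }
) {
  let handle = null;  // pending timer, if any
  let pending = null; // latest unsaved record

  const runSave = () => {
    handle = null;
    const toSave = pending;
    pending = null;
    save(toSave);
  };

  return {
    // Call on every edit; restarts the inactivity countdown.
    notifyChange(record) {
      pending = record;
      if (handle !== null) timers.clearTimeout(handle);
      handle = timers.setTimeout(runSave, delayMs);
    },
    // Manual save: cancel the countdown and save immediately, if anything is pending.
    flush() {
      if (handle === null) return;
      timers.clearTimeout(handle);
      runSave();
    },
  };
}

const saver = createAutoSaver((record) => console.log("saved", record));
saver.notifyChange({ name: "Asha Patel" });
saver.flush(); // saves immediately instead of waiting 2 seconds
```

Only the most recent edit is kept while the countdown is running, so rapid typing results in a single save rather than one per keystroke.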

Export & Integration

  • PDF Export: Generate professional PDF reports with all record data
  • Excel Export: Export to .xlsx format with proper formatting
  • API Gateway: Send processed data to external systems via REST API
  • Configurable Outbound: Set up custom API endpoints for data integration

Security Features

  • Helmet.js: Security headers protection
  • Rate Limiting: Throttles brute-force attempts (5 attempts per 15 minutes on authentication endpoints)
  • Input Validation: Comprehensive validation using express-validator
  • CORS Configuration: Controlled cross-origin resource sharing
  • JWT Expiry: Tokens expire after 30 days
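To make the "5 attempts per 15 min" rule concrete, here is a minimal fixed-window limiter using only in-memory state. It is a sketch of the general technique, not the project's middleware (which may well rely on a package such as express-rate-limit). The names `createRateLimiter` and `isAllowed`, and the injectable `now` clock, are assumptions made for the example.

```javascript
// Hypothetical fixed-window rate limiter keyed by client (for example an IP address).
function createRateLimiter({
  maxAttempts = 5,
  windowMs = 15 * 60 * 1000,
  now = () => Date.now(),
} = {}) {
  const windows = new Map(); // key -> { start, count }

  return function isAllowed(key) {
    const time = now();
    const current = windows.get(key);
    // Start a fresh window on first sight or once the old window has elapsed.
    if (!current || time - current.start >= windowMs) {
      windows.set(key, { start: time, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= maxAttempts;
  };
}

const isAllowed = createRateLimiter();
for (let attempt = 1; attempt <= 6; attempt++) {
  console.log(attempt, isAllowed("203.0.113.7"));
}
```

An in-memory map like this resets on restart and is not shared across processes, so a multi-instance deployment would need a shared store instead.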

📋 Prerequisites

  • Node.js v16 or higher
  • MongoDB Atlas account (free tier) or local MongoDB
  • Google Cloud Platform account (for OAuth)
  • npm or yarn package manager

🔧 Installation & Setup

Detailed setup instructions are available in setup.md.

Quick Start

  1. Clone the repository

    git clone <your-repo-url>
    cd duplicate-finder-dashboard
  2. Install dependencies

    # Backend
    cd backend
    npm install
    
    # Frontend
    cd ..
    npm install
  3. Configure environment variables

    • Copy .env.example to .env in both root and backend folders
    • Update MongoDB URI, JWT secret, and Google OAuth credentials
    • See setup.md for detailed instructions
  4. Run the application

    # Terminal 1 - Backend
    cd backend
    npm start
    
    # Terminal 2 - Frontend (run from the project root)
    npm run dev
  5. Access the application

📚 Documentation

πŸ—οΈ Project Structure

duplicate-finder-dashboard/
├── backend/                    # Backend API
│   ├── config/                 # Database configuration
│   ├── controllers/            # Request handlers
│   ├── middleware/             # Auth, validation, etc.
│   ├── models/                 # Mongoose models
│   ├── routes/                 # API routes
│   ├── scripts/                # Utility scripts
│   └── utils/                  # Helper functions
├── src/                        # Frontend React app
│   ├── components/             # Reusable components
│   ├── pages/                  # Page components
│   ├── config/                 # Configuration
│   └── utils/                  # Utility functions
├── public/                     # Static assets
├── setup.md                    # Setup instructions
├── security.md                 # Security documentation
└── README.md                   # This file

🔒 Security

This application implements multiple layers of security:

  • Password hashing with bcrypt
  • JWT token-based authentication
  • Rate limiting on authentication endpoints
  • Input validation and sanitization
  • Security headers with Helmet.js
  • CORS configuration
  • MongoDB injection prevention

For detailed security information, see security.md.
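As an illustration of the "MongoDB injection prevention" item, one common approach is to strip operator-like keys from parsed request bodies before they reach a query. The sketch below assumes that approach. The function name `sanitize` is hypothetical, the code is meant for plain JSON data only, and the repository may use a different mechanism.

```javascript
// Hypothetical sanitizer for parsed JSON input: drops keys that MongoDB could
// interpret as operators ("$gt", "$where", ...) or as dotted field paths.
function sanitize(value) {
  if (Array.isArray(value)) return value.map(sanitize);
  if (value !== null && typeof value === "object") {
    const clean = {};
    for (const [key, inner] of Object.entries(value)) {
      // Also skip "__proto__" so the assignment below cannot alter the prototype.
      if (key.startsWith("$") || key.includes(".") || key === "__proto__") continue;
      clean[key] = sanitize(inner);
    }
    return clean;
  }
  return value; // strings, numbers, booleans, null pass through unchanged
}

// A login body where the attacker replaces the email string with an operator object.
const body = { email: { $gt: "" }, password: "secret" };
console.log(JSON.stringify(sanitize(body)));
```

Stripping keys only neutralizes the operator; validating that fields such as `email` are strings in the first place (for example with express-validator, which the README already lists) remains the stronger check.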

🧪 Testing

Manual Testing

  1. Authentication Flow

    • Register with @paruluniversity.ac.in email
    • Login with credentials
    • Login with Google OAuth
  2. File Processing

    • Upload CSV file with student data
    • Verify duplicate detection
    • Check data on Home and Data Insights pages
  3. Mark Duplication

    • Edit records inline
    • Test auto-save functionality
    • Re-run duplicate grouping
    • Toggle potential duplicate flags
  4. Export Functionality

    • Export to PDF
    • Export to Excel
    • Verify downloaded files

API Testing with cURL

Examples available in setup.md.

🌐 Deployment

Backend Deployment

  1. Update environment variables for production
  2. Use MongoDB Atlas connection string
  3. Set secure JWT secret
  4. Configure production CORS origins
  5. Enable HTTPS

Frontend Deployment

  1. Set VITE_API_URL to your production backend URL before building (Vite inlines environment variables at build time)
  2. Build the production bundle: npm run build
  3. Deploy the dist folder to your hosting service

📊 Technology Stack

Frontend

  • React 18
  • TypeScript
  • Tailwind CSS
  • React Router
  • Shadcn UI Components
  • Recharts (for graphs)

Backend

  • Node.js
  • Express.js
  • MongoDB with Mongoose
  • JWT for authentication
  • bcrypt for password hashing
  • Express Validator
  • Helmet.js
  • Rate Limiting

Development Tools

  • Vite
  • ESLint
  • Git

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License.

👥 Support

For issues, questions, or contributions, please open an issue on GitHub.

🔄 Changelog

Version 1.0.0 (Current)

  • Initial release with complete authentication system
  • CSV file processing with duplicate detection
  • Interactive Mark Duplication screen
  • PDF and Excel export functionality
  • Dashboard with analytics and graphs
  • API gateway for external integrations
  • Comprehensive security features

🎯 Future Enhancements

  • Bulk edit functionality
  • Advanced search and filtering
  • Email notifications
  • Scheduled file processing
  • Advanced duplicate detection algorithms
  • Mobile-responsive design improvements
  • Dark mode support

About

A tool to find duplicates in a CSV file
