Merge branch 'IU-PR:master' into master

Showing 20 changed files with 321 additions and 8 deletions.
---
bookCollapseSection: true
title: "EyeSpy"
---
---
title: "Week #1"
---

# Week #1

## **Team Formation and Project Proposal**

### **Team Members**

| Team Member | Telegram ID | Email Address     |
| ----------- | ----------- | ----------------- |
| AlexStrNik  | @alexstrnik | [email protected] |

### **Value Proposition**

- Identify the Problem:

In today's digital age, ensuring the integrity of remote exams and interviews is a significant challenge. Traditional proctoring methods often invade user privacy by sending video streams to external servers for monitoring. This not only raises privacy concerns but also risks exposing sensitive data to unauthorized access. There is a clear need for a solution that balances security and privacy without disrupting the user experience.

- Solution Description:

Our app tackles this issue by using advanced eye-tracking technology to monitor where users are looking during remote exams and interviews. The key feature is that all video processing happens locally on the user's macOS device, so no data is transmitted off the machine and everything stays private. Whenever the app detects that the user is not looking at the monitor, it triggers an alarm, keeping the user engaged and reducing the chances of cheating.

- Benefits to Users:

With our app, users enjoy a high level of privacy since all data stays on their device. Continuous eye-gaze monitoring ensures that users stay focused on their tasks, upholding the integrity of exams and interviews. The app operates smoothly in the background, offering a hassle-free proctoring solution that respects user privacy and provides peace of mind.

- Differentiation:

Our app stands out because it processes video streams locally, unlike other proctoring solutions that send data to external servers. This approach keeps user data secure and private. Additionally, the eye-tracking technology delivers precise and reliable monitoring, ensuring users stay engaged throughout their exams or interviews.

- User Impact:

By addressing privacy concerns and enhancing the security of remote proctoring, our app significantly improves the user experience. Students and professionals can confidently participate in exams and interviews, knowing their data is protected. Educational institutions and companies benefit from a trustworthy solution that maintains the integrity of their assessments without compromising user privacy.

- User Testimonials or Use Cases:

Students have found the app to be a game-changer for online exams, feeling more secure knowing their data isn't being sent anywhere. Professionals appreciate the focus it ensures during remote interviews, helping maintain fair and honest practices. Educational institutions have streamlined their proctoring processes, finding the app reliable, respectful of student privacy, and effective in maintaining exam integrity.

## **Lean Questionnaire**

Please answer the following questions related to the lean methodology:

1. What problem or need does your software project address?

Our software addresses the challenge of ensuring integrity during remote exams and interviews while protecting user privacy. Traditional proctoring methods often compromise privacy by transmitting video streams to external servers, which can lead to data breaches and unauthorized access. Our app keeps all video processing local, ensuring that sensitive data never leaves the user's device, thus maintaining privacy and security.

2. Who are your target users or customers?

Our target users are primarily students taking remote exams and professionals participating in remote interviews. Educational institutions and companies conducting these exams and interviews are also key users, as they need a reliable and privacy-conscious solution to ensure the integrity of their processes.

3. How will you validate and test your assumptions about the project?

To validate and test our assumptions, we plan to conduct an experiment involving a group of students divided into two subgroups. One subgroup will attempt to cheat during the exams, while the other will not. By monitoring both groups, we can assess how effectively our eye-tracking technology and local processing detect cheating attempts. We will measure success by analyzing true/false positives and negatives to ensure our app accurately distinguishes between genuine and cheating behaviors.

4. What metrics will you use to measure the success of your project?

We will measure the success of our project by examining the rates of true positives, false positives, true negatives, and false negatives. True positives occur when the app correctly flags cheating behavior, and true negatives occur when it correctly recognizes non-cheating behavior. False positives and false negatives will highlight inaccuracies and areas needing improvement. By analyzing these metrics, we can gauge the app's accuracy and reliability in real-world scenarios (a short sketch after this questionnaire shows how these counts combine into summary metrics).

5. How do you plan to iterate and pivot if necessary based on user feedback?

Based on user feedback, we plan to iterate by refining the eye-tracking algorithms and improving the user interface to enhance accuracy and usability. If users report issues or suggest enhancements, we will prioritize those adjustments in our development cycle. Continuous feedback will guide our updates, ensuring the app meets user needs and maintains high standards of privacy and security. By staying responsive to user input, we can ensure the app evolves to better serve its purpose.
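To make the metric definitions above concrete, here is a small illustrative Swift sketch of how those four counts combine into accuracy, precision, and recall. The type name and the commented example counts are placeholders, not measured results.

```swift
/// Sketch: summarizing a validation run from raw counts.
/// The example numbers in the usage comment are made up, not experimental results.
struct ProctoringMetrics {
    let truePositives: Int   // cheating that was correctly flagged
    let falsePositives: Int  // honest behavior incorrectly flagged
    let trueNegatives: Int   // honest behavior correctly passed
    let falseNegatives: Int  // cheating that went undetected

    var accuracy: Double {
        let total = truePositives + falsePositives + trueNegatives + falseNegatives
        return Double(truePositives + trueNegatives) / Double(total)
    }
    var precision: Double {
        Double(truePositives) / Double(truePositives + falsePositives)
    }
    var recall: Double {
        Double(truePositives) / Double(truePositives + falseNegatives)
    }
}

// Example usage with placeholder counts:
// let m = ProctoringMetrics(truePositives: 18, falsePositives: 2,
//                           trueNegatives: 25, falseNegatives: 5)
// print(m.accuracy, m.precision, m.recall)
```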
## **Leveraging AI, Open-Source, and Experts**

Our app uses cutting-edge AI and open-source technology to track eye gaze effectively. We start with an open-source PyTorch model called gaze-estimation, which predicts where you're looking based on images of your eyes. To make it run fast on macOS, we convert this model to CoreML, letting us use the computer's hardware more efficiently. This means our app works in real time without needing a high-end video card.

We also reviewed many studies to find the best models for real-time eye-gaze estimation. This helps us ensure our app uses the latest and most efficient technology available. By combining advanced AI, open-source solutions, and expert research, we provide a reliable, high-performing eye-tracking app that keeps all data processing local for maximum privacy.

## **Defining the Vision for Your Project**

- Overview:

Our project aims to create a privacy-focused macOS app that uses eye-tracking technology to ensure the integrity of remote exams and interviews. By leveraging advanced AI and local processing, we ensure that user data remains secure on the device. The app monitors eye gaze in real time and triggers an alarm if the user is not looking at the screen, helping to prevent cheating without compromising privacy.

- Schematic Drawings:

![app_scheme](/2024/EyeSpy/App_Scheme.png)

- Tech Stack:

Our tech stack combines several powerful tools and frameworks to deliver a high-performing app. For app development, we use CoreML, Swift 5, the Vision Framework, AppKit, and URLSession. To prepare the model, we rely on coremltools, Python, and PyTorch. For the backend, we use Python and FastAPI. This combination keeps the app efficient and able to process data locally, maintaining user privacy while delivering real-time eye-tracking capabilities.
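As an illustration of the app–backend interaction, below is a minimal sketch of how a proctoring event might be reported to the FastAPI server over URLSession. The `GazeEvent` fields and the `events` endpoint path are assumptions made for this example, not the project's actual API.

```swift
import Foundation

/// Illustrative payload for a proctoring event; the field names are assumptions,
/// not the real EyeSpy API contract.
struct GazeEvent: Codable {
    let sessionID: String
    let lookingAtScreen: Bool
    let timestamp: Date
}

/// Posts a gaze event to the backend. The "events" endpoint path is hypothetical.
func report(event: GazeEvent, to baseURL: URL) async throws {
    var request = URLRequest(url: baseURL.appendingPathComponent("events"))
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    let encoder = JSONEncoder()
    encoder.dateEncodingStrategy = .iso8601
    request.httpBody = try encoder.encode(event)

    let (_, response) = try await URLSession.shared.data(for: request)
    guard let http = response as? HTTPURLResponse,
          (200..<300).contains(http.statusCode) else {
        throw URLError(.badServerResponse)
    }
}
```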
---
title: "Week #2"
---

# **Week #2**

## **Week 2 - Choosing the Tech Stack, Designing the Architecture**

### **Tech Stack Selection**

We chose our tech stack to make sure our app is fast, reliable, and easy to develop.

The Vision Framework is well suited to detecting face landmarks in real time, which is crucial for accurate eye tracking. CoreML lets us use Apple's Neural Engine and M-series chips to run gaze estimation quickly on the user's device, without relying on external servers.

For the backend, we picked FastAPI because it's simple and stable, making it easy to build and maintain. SwiftUI was our choice for the user interface because it has a great API, is easy to learn, and allows for rapid development.

In short, our tech stack includes the Vision Framework for real-time face detection, CoreML for efficient on-device model inference, FastAPI for a reliable backend, and SwiftUI for fast UI development. This setup ensures our app works well and is easy to build and maintain.
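As a rough sketch of how the Vision Framework surfaces the face landmarks we rely on, the snippet below runs a landmark request on a single camera frame. The function name and the simplified error handling are illustrative, not the app's final implementation.

```swift
import Vision
import CoreVideo

/// Minimal sketch: run Vision's face-landmark detection on one camera frame.
func detectFaceLandmarks(in pixelBuffer: CVPixelBuffer,
                         completion: @escaping ([VNFaceObservation]) -> Void) {
    let request = VNDetectFaceLandmarksRequest { request, error in
        guard error == nil,
              let faces = request.results as? [VNFaceObservation] else {
            completion([])
            return
        }
        // Each observation exposes normalized landmark regions such as leftEye and rightEye.
        completion(faces)
    }

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    do {
        try handler.perform([request])
    } catch {
        completion([])
    }
}
```

The resulting `VNFaceObservation` values carry the normalized `leftEye`/`rightEye` regions that later stages can crop and feed to the gaze model.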
### **Architecture Design**

1. **Component Breakdown**:

Our system consists of three main components: the Server, the App, and a simple Web page. The Server handles backend processes and logging. The App performs eye tracking and proctoring, ensuring all actions are processed locally on the user's device. The Web page allows administrators to monitor user behavior and detect cheating in real time.

2. **Data Management**:

There's no need for extensive data management since we don't store any user data. Proctoring sessions are temporary and exist only while they are running. This approach ensures maximum privacy and reduces the risk of data breaches.

3. **User Interface (UI) Design**:

The UI design follows Apple's Human Interface Guidelines, ensuring a seamless and intuitive user experience. SwiftUI helps us create a responsive and visually appealing interface that aligns with Apple's design principles.

4. **Integration and APIs**:

Our integration relies on robust APIs to ensure smooth communication between components. The app communicates with the server using efficient and secure endpoints provided by FastAPI, enabling real-time monitoring and updates.

5. **Scalability and Performance**:

By leveraging CoreML and the Vision Framework, our app runs efficiently on user devices, making it scalable without the need for powerful external servers. The use of Docker for deployment ensures that our server environment is consistent and can handle multiple instances easily.

6. **Security and Privacy**:

Security and privacy are paramount. All processing is done locally on the user's device, ensuring that no sensitive data leaves the device. The backend server only logs necessary actions without storing any personal information.

7. **Error Handling and Resilience**:

For error handling, the app relies on Swift's type and memory safety, ensuring that errors are surfaced and managed effectively. The server uses logging to track and manage errors, providing resilience and easy troubleshooting.

8. **Deployment and DevOps**:

Deployment is handled using Docker, allowing for a consistent and isolated environment. This ensures that our app and server can be easily deployed and scaled across different platforms without compatibility issues.

### **Week 2 Questionnaire**

1. Tech Stack Resources:

For our tech stack resources, we primarily use StackOverflow, PyTorch, and Apple documentation. These resources provide the necessary information and support for working with the various technologies and frameworks in our project.

- [Vision Framework](https://developer.apple.com/documentation/vision)
- [Core ML](https://developer.apple.com/documentation/coreml/)
- [Swift UI](https://developer.apple.com/tutorials/swiftui/)

2. Mentorship Support:

Currently, I do not have any mentorship support. I work independently and rely on self-learning and online resources to advance my knowledge and skills.

3. Exploring Alternative Resources:

To supplement my learning, I explore alternative resources such as online tutorials, forums, and documentation. These help me gain a deeper understanding of the technologies I am working with and find solutions to any challenges I encounter.

4. Identifying Knowledge Gaps:

As a web developer, I need to learn more about Apple frameworks and machine learning. These areas are crucial for our project's success, and I am focusing on improving my skills in these domains.

5. Engaging with the Tech Community:

Engaging with the tech community is important, but currently I mainly interact with online communities and forums like StackOverflow to seek advice and share knowledge.

6. Learning Objectives:

My main learning objective is to understand how SwiftUI apps are built. This involves mastering SwiftUI, AppKit, and other related Apple frameworks to create a seamless and efficient app.

7. Sharing Knowledge with Peers:

At this moment, I do not share knowledge with peers since I have no teammates and there is no mentorship support. My learning is self-directed and independent.

8. Leveraging AI:

We use an LLM as a cognitive model inside Xcode to provide better code completions. This AI support helps improve coding efficiency and accuracy, making development smoother and more productive.

### **Tech Stack and Team Allocation**

| Team Member        | Track      | Responsibilities       |
| ------------------ | ---------- | ---------------------- |
| Aleksandr Strijnev | Everything | App, Server, Web Panel |

### **Weekly Progress Report**

Our team did some initial research on available solutions and began writing parts of the app in SwiftUI. We focused on understanding how to implement eye-tracking features and integrate them with the Vision Framework and CoreML. The initial coding efforts in SwiftUI are aimed at setting up the user interface and ensuring it aligns with Apple's Human Interface Guidelines. This week's progress lays the foundation for developing a reliable, privacy-focused eye-tracking app for remote exams and interviews.

### **Challenges & Solutions**

This week, we faced a significant challenge due to the lack of tutorials on handling frames from AVCaptureSession in SwiftUI apps, which made it difficult to find clear guidance on integrating real-time video processing with our app's user interface.
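One workable pattern, shown below as a simplified sketch, is to wrap the capture pipeline in a delegate object that a SwiftUI view model can own and that forwards each frame to the recognition code. The class and property names here are illustrative, not the project's actual code.

```swift
import AVFoundation

/// Minimal sketch of pulling frames out of AVCaptureSession so they can be fed
/// into Vision/CoreML from a SwiftUI app.
final class CameraFrameSource: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let output = AVCaptureVideoDataOutput()
    private let queue = DispatchQueue(label: "camera.frames")

    /// Called for every captured frame; forward the pixel buffer to the gaze pipeline.
    var onFrame: ((CVPixelBuffer) -> Void)?

    func start() throws {
        guard let camera = AVCaptureDevice.default(for: .video) else { return }
        let input = try AVCaptureDeviceInput(device: camera)
        if session.canAddInput(input) { session.addInput(input) }

        output.setSampleBufferDelegate(self, queue: queue)
        if session.canAddOutput(output) { session.addOutput(output) }

        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        onFrame?(pixelBuffer)
    }
}
```

A SwiftUI view model can own a `CameraFrameSource`, push each `onFrame` pixel buffer into the Vision/CoreML pipeline, and publish the result for the UI, keeping the AVFoundation details out of the view layer.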
### **Conclusions & Next Steps**

This week, despite the challenges, we made progress in understanding AVCaptureSession and starting its integration with SwiftUI. Next, we will focus on performing face detection and drawing the detected landmarks for debugging. This will help us ensure that our eye-tracking feature works accurately and efficiently in real time.
---
title: "Week #3"
---

# **Week #3**

## **Developing the first prototype, creating the priority list**

- **Technical Infrastructure**:

The infrastructure and backend development are scheduled for a later week. This week, we concentrated on the main feature: face detection.

- **Backend Development**:

Backend development tasks will be addressed in the upcoming weeks. Our current focus is on the front end and core features.

- **Frontend Development**:

Significant progress was made on the app's front end. We successfully implemented face detection and landmark drawing. These features are now functional, providing a solid foundation for further development.

- **Data Management**:

Data management isn't applicable in our case since we don't store any user data. Proctoring sessions are temporary and exist only while running.

- **Prototype Testing**:

For testing, I used myself as a subject in different lighting conditions and environments to verify the accuracy and reliability of face detection and landmark drawing. This practical approach helped us refine the feature and prepare for more extensive testing.

![Week3_Progress](/2024/EyeSpy/Week3_Progress.jpg)

> Me and my eyes

## **Weekly Progress Report**

Our team focused on landmark recognition and successfully implemented eye cropping. This involved understanding the coordinate space of the landmarks and using the low-level Quartz framework for image cropping. We needed to ensure real-time cropping with minimal resource usage, preserving performance for CoreML processing.

### **Challenges & Solutions**

One of the biggest challenges was understanding the coordinate space of the landmarks and performing image cropping with the Quartz framework. This was essential to achieve real-time processing without using excessive resources. We tackled it by diving deep into the Quartz documentation and experimenting with different approaches until we achieved the desired performance.
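A condensed sketch of that coordinate handling is shown below: Vision reports landmarks in a normalized, lower-left-origin space, so the points have to be mapped into pixel coordinates and the crop rectangle flipped before Quartz can cut out the eye region. The helper name and the padding value are our own illustrative choices, not the project's exact code.

```swift
import Vision
import CoreGraphics

/// Sketch: crop one eye out of a camera frame using Vision landmarks and Quartz.
func cropEye(from frame: CGImage, eye: VNFaceLandmarkRegion2D) -> CGImage? {
    let imageSize = CGSize(width: frame.width, height: frame.height)

    // Vision hands back points normalized to the face bounding box;
    // pointsInImage(imageSize:) converts them to pixel coordinates
    // with the origin in the lower-left corner.
    let points = eye.pointsInImage(imageSize: imageSize)
    guard let minX = points.map(\.x).min(),
          let maxX = points.map(\.x).max(),
          let minY = points.map(\.y).min(),
          let maxY = points.map(\.y).max() else { return nil }

    // Add a little padding so the gaze model sees some context around the eye.
    let padding: CGFloat = 8
    var rect = CGRect(x: minX - padding,
                      y: minY - padding,
                      width: (maxX - minX) + 2 * padding,
                      height: (maxY - minY) + 2 * padding)

    // CGImage.cropping(to:) expects a top-left origin, so flip the y axis.
    rect.origin.y = imageSize.height - rect.origin.y - rect.height

    return frame.cropping(to: rect.integral)
}
```

The returned `CGImage` can then be handed to the CoreML stage once the converted gaze model is in place.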
### **Conclusions & Next Steps**

This week, we made significant progress in landmark recognition and eye cropping, setting the stage for the next phase. The next steps are converting our PyTorch model to CoreML and integrating it into our recognition flow. This will let us leverage the full capabilities of CoreML for efficient and accurate eye tracking.
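As a preview of that integration, the sketch below shows one way a converted model could be driven through Vision once it is bundled with the app. `GazeEstimation` is a placeholder for whatever class Xcode generates from the compiled model, and the observation handling depends on how the conversion exposes the model's outputs.

```swift
import Vision
import CoreML

/// Sketch of wiring a converted gaze-estimation CoreML model into the recognition flow.
/// "GazeEstimation" is a placeholder for the Xcode-generated model class, not the real name.
func estimateGaze(onEyeCrop eyeImage: CGImage,
                  completion: @escaping ([VNCoreMLFeatureValueObservation]) -> Void) throws {
    let config = MLModelConfiguration()
    config.computeUnits = .all  // let CoreML pick CPU / GPU / Neural Engine

    let coreMLModel = try GazeEstimation(configuration: config).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        let results = request.results as? [VNCoreMLFeatureValueObservation] ?? []
        completion(results)  // e.g. pitch/yaw angles, depending on the model's outputs
    }
    request.imageCropAndScaleOption = .scaleFill

    let handler = VNImageRequestHandler(cgImage: eyeImage, options: [:])
    try handler.perform([request])
}
```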