LetterLiGo/letterli-arxiv-daily

Updated on 2024.11.08

Usage instructions: here

Table of Contents
  1. Text-to-Image Safety
  2. LLM Security & Privacy
  3. LLM Agent Security & Privacy
  4. Audio Deepfake

Text-to-Image Safety

Publish Date Title Authors PDF Code
2022-11-10 Red-Teaming the Stable Diffusion Safety Filter Javier Rando et.al. 2210.04610 null

LLM Security & Privacy

Publish Date Title Authors PDF Code
2024-10-28 Systematically Analyzing Prompt Injection Vulnerabilities in Diverse LLM Architectures Victoria Benjamin et.al. 2410.23308 null
2024-10-22 DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models Chen Qian et.al. 2410.16672 link
2024-10-20 Jailbreaking and Mitigation of Vulnerabilities in Large Language Models Benji Peng et.al. 2410.15236 null
2024-10-18 From Solitary Directives to Interactive Encouragement! LLM Secure Code Generation by Natural Language Prompting Shigang Liu et.al. 2410.14321 null
2024-10-17 Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis Yiyi Chen et.al. 2410.13237 null
2024-10-07 Aligning LLMs to Be Robust Against Prompt Injection Sizhe Chen et.al. 2410.05451 link
2024-10-06 Towards Secure Tuning: Mitigating Security Risks Arising from Benign Instruction Fine-Tuning Yanrui Du et.al. 2410.04524 null
2024-10-04 MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks Giandomenico Cornacchia et.al. 2409.17699 null
2024-10-03 PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach Zhihao Lin et.al. 2409.14177 null
2024-10-01 Extracting Memorized Training Data via Decomposition Ellen Su et.al. 2409.12367 null
2024-10-19 Securing Large Language Models: Addressing Bias, Misinformation, and Prompt Attacks Benji Peng et.al. 2409.08087 null
2024-09-06 Recent Advances in Attack and Defense Approaches of Large Language Models Jing Cui et.al. 2409.03274 null
2024-10-11 Safety Layers in Aligned Large Language Models: The Key to LLM Security Shen Li et.al. 2408.17003 null
2024-08-27 Investigating Coverage Criteria in Large Language Models: An In-Depth Study Through Jailbreak Attacks Shide Zhou et.al. 2408.15207 null
2024-09-06 LLM-PBE: Assessing Data Privacy in Large Language Models Qinbin Li et.al. 2408.12787 null
2024-08-21 Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks Yiyi Chen et.al. 2408.11749 link
2024-07-11 Virtual Context: Enhancing Jailbreak Attacks with Special Token Injection Yuqi Zhou et.al. 2406.19845 null
2024-06-26 Natural Language but Omitted? On the Ineffectiveness of Large Language Models' privacy policy from End-users' Perspective Shuning Zhang et.al. 2406.18100 null
2024-06-24 Noisy Neighbors: Efficient membership inference attacks against LLMs Filippo Galli et.al. 2406.16565 null
2024-06-18 Can We Trust Large Language Models Generated Code? A Framework for In-Context Learning, Security Patterns, and Code Evaluations Across Diverse LLMs Ahmad Mohsin et.al. 2406.12513 null
2024-06-17 Self and Cross-Model Distillation for LLMs: Effective Methods for Refusal Pattern Alignment Jie Li et.al. 2406.11285 null
2024-06-16 garak: A Framework for Security Probing Large Language Models Leon Derczynski et.al. 2406.11036 link
2024-06-06 AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens Lin Lu et.al. 2406.03805 null
2024-05-25 FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference Chenqi Lin et.al. 2405.16241 null
2024-05-24 Hacc-Man: An Arcade Game for Jailbreaking LLMs Matheus Valentim et.al. 2405.15902 null
2024-06-13 SecureLLM: Using Compositionality to Build Provably Secure Language Models for Private, Sensitive, and Secret Data Abdulrahman Alabdulkareem et.al. 2405.09805 link
2024-05-03 LLM Security Guard for Code Arya Kavian et.al. 2405.01103 link
2024-04-19 CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models Manish Bhatt et.al. 2404.13161 link
2024-11-04 Private Attribute Inference from Images with Vision-Language Models Batuhan Tömekçe et.al. 2404.10618 null
2024-04-16 Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning Xiao Wang et.al. 2404.10552 null
2024-04-12 Subtoxic Questions: Dive Into Attitude Change of LLM's Response in Jailbreak Attempts Tianyu Zhang et.al. 2404.08309 null
2024-03-20 Mapping LLM Security Landscapes: A Comprehensive Stakeholder Risk Assessment Proposal Rahul Pankajakshan et.al. 2403.13309 null
2024-03-23 Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models Yi Luo et.al. 2403.11838 link
2024-03-13 Tastle: Distract Large Language Models for Automatic Jailbreak Attack Zeguan Xiao et.al. 2403.08424 link
2024-03-14 On Protecting the Data Privacy of Large Language Models (LLMs): A Survey Biwei Yan et.al. 2403.05156 null
2024-02-28 A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems Fangzhou Wu et.al. 2402.18649 null
2024-06-10 Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction Tong Liu et.al. 2402.18104 link
2024-06-04 ASETF: A Novel Method for Jailbreak Attack on LLMs through Translate Suffix Embeddings Hao Wang et.al. 2402.16006 null
2024-06-18 Is the System Message Really Important to Jailbreaks in Large Language Models? Xiaotian Zou et.al. 2402.14857 null
2024-05-17 A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models Zihao Xu et.al. 2402.13457 link
2024-09-25 StruQ: Defending Against Prompt Injection with Structured Queries Sizhe Chen et.al. 2402.06363 link
2024-10-31 Fight Back Against Jailbreaking via Prompt Adversarial Tuning Yichuan Mo et.al. 2402.06255 link
2024-06-05 Text Embedding Inversion Security for Multilingual Language Models Yiyi Chen et.al. 2401.12192 link
2024-11-06 ConfusionPrompt: Practical Private Inference for Online Large Language Models Peihua Mai et.al. 2401.00870 null
2023-12-18 A Comprehensive Survey of Attack Techniques, Implementation, and Mitigation Strategies in Large Language Models Aysan Esmradi et.al. 2312.10982 null
2024-03-20 A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly Yifan Yao et.al. 2312.02003 null
2023-10-16 Prompt Packer: Deceiving LLMs through Compositional Instruction with Hidden Attacks Shuyu Jiang et.al. 2310.10077 null
2024-05-06 Beyond Memorization: Violating Privacy Via Inference with Large Language Models Robin Staab et.al. 2310.07298 link
2023-09-04 Baseline Defenses for Adversarial Attacks Against Aligned Language Models Neel Jain et.al. 2309.00614 null
2023-11-01 Multi-step Jailbreaking Privacy Attacks on ChatGPT Haoran Li et.al. 2304.05197 link

LLM Agent Security & Privacy

Publish Date Title Authors PDF Code
2024-11-01 Defense Against Prompt Injection Attack by Leveraging Attack Techniques Yulin Chen et.al. 2411.00459 null
2024-11-01 Attention Tracker: Detecting Prompt Injection Attacks in LLMs Kuo-Han Hung et.al. 2411.00348 null
2024-10-28 Systematically Analyzing Prompt Injection Vulnerabilities in Diverse LLM Architectures Victoria Benjamin et.al. 2410.23308 null
2024-10-30 InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models Hao Li et.al. 2410.22770 link
2024-10-29 Embedding-based classifiers can detect prompt injection attacks Md. Ahsan Ayub et.al. 2410.22284 link
2024-10-28 FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks Jiongxiao Wang et.al. 2410.21492 link
2024-10-28 Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection Md Abdur Rahman et.al. 2410.21337 null
2024-10-27 LLM Robustness Against Misinformation in Biomedical Question Answering Alexander Bondarenko et.al. 2410.21330 link
2024-10-28 Palisade -- Prompt Injection Detection Framework Sahasra Kokkula et.al. 2410.21146 null
2024-10-22 Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In Itay Nakash et.al. 2410.16950 null
2024-10-21 SMILES-Prompting: A Novel Approach to LLM Jailbreak Attacks in Chemical Synthesis Aidan Wong et.al. 2410.15641 link
2024-10-18 Making LLMs Vulnerable to Prompt Injection via Poisoning Alignment Zedian Shao et.al. 2410.14827 link
2024-10-18 Backdoored Retrievers for Prompt Injection Attacks on Retrieval Augmented Generation of Large Language Models Cody Clop et.al. 2410.14479 null
2024-10-09 Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems Donghyun Lee et.al. 2410.07283 null
2024-10-07 Aligning LLMs to Be Robust Against Prompt Injection Sizhe Chen et.al. 2410.05451 link
2024-10-07 A test suite of prompt injection attacks for LLM-based machine translation Antonio Valerio Miceli-Barone et.al. 2410.05047 link
2024-10-03 Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents Hanrong Zhang et.al. 2410.02644 link
2024-09-29 GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks Rongchang Li et.al. 2409.19521 null
2024-10-10 System-Level Defense against Indirect Prompt Injection Attacks: An Information Flow Control Perspective Fangzhou Wu et.al. 2409.19091 link
2024-09-23 PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs Jiahao Yu et.al. 2409.14729 link
2024-09-20 Applying Pre-trained Multilingual BERT in Embeddings for Improved Malicious Prompt Injection Attacks Detection Md Abdur Rahman et.al. 2409.13331 null
2024-08-08 FDI: Attack Neural Code Generation Systems through User Feedback Channel Zhensu Sun et.al. 2408.04194 link
2024-09-09 A Study on Prompt Injection Attack Against LLM-Integrated Mobile Robotic Systems Wenxiao Zhang et.al. 2408.03515 null
2024-08-01 WHITE PAPER: A Brief Exploration of Data Exfiltration using GCG Suffixes Victor Valbuena et.al. 2408.00925 null
2024-07-23 Prompt Injection Attacks on Large Language Models in Oncology Jan Clusmann et.al. 2407.18981 null
2024-07-12 A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends Daizong Liu et.al. 2407.07403 link
2024-06-20 Prompt Injection Attacks in Defended Systems Daniil Khomsky et.al. 2406.14048 null
2024-07-18 AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents Edoardo Debenedetti et.al. 2406.13352 link
2024-06-11 Knowledge Return Oriented Prompting (KROP) Jason Martin et.al. 2406.11880 null
2024-09-05 SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner Xunguang Wang et.al. 2406.05498 null
2024-09-25 Ranking Manipulation for Conversational Search Engines Samuel Pfrommer et.al. 2406.03589 link
2024-11-03 Are you still on track!? Catching LLM Task Drift with Activations Sahar Abdelnabi et.al. 2406.00799 link
2024-06-06 Exfiltration of personal information from ChatGPT via prompt injection Gregory Schwartzman et.al. 2406.00199 null
2024-05-31 Preemptive Answer "Attacks" on Chain-of-Thought Reasoning Rongwu Xu et.al. 2405.20902 null
2024-09-24 Goal-guided Generative Prompt Injection Attack on Large Language Models Chong Zhang et.al. 2404.07234 null
2024-07-29 Fine-Tuning, Quantization, and LLMs: Navigating Unintended Outcomes Divyanshu Kumar et.al. 2404.04392 null
2024-08-24 Optimization-based Prompt Injection Attack to LLM-as-a-Judge Jiawen Shi et.al. 2403.17710 null
2024-03-20 Defending Against Indirect Prompt Injection Attacks With Spotlighting Keegan Hines et.al. 2403.14720 null
2024-03-14 Scaling Behavior of Machine Translation with Large Language Models under Prompt Injection Attacks Zhifan Sun et.al. 2403.09832 link
2024-03-07 Automatic and Universal Prompt Injection Attacks against Large Language Models Xiaogeng Liu et.al. 2403.04957 link
2024-05-02 Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks Dario Pasquini et.al. 2403.03792 link
2024-02-16 The AI Security Pyramid of Pain Chris M. Ward et.al. 2402.11082 null
2024-02-15 AbuseGPT: Abuse of Generative AI ChatBots to Create Smishing Campaigns Ashfak Md Shibli et.al. 2402.09728 null
2024-09-25 StruQ: Defending Against Prompt Injection with Structured Queries Sizhe Chen et.al. 2402.06363 link
2024-02-08 In-Context Learning Can Re-learn Forbidden Tasks Sophie Xhonneux et.al. 2402.05723 null
2024-01-31 An Early Categorization of Prompt Injection Attacks on Large Language Models Sippo Rossi et.al. 2402.00898 null
2024-10-15 Mitigating the Influence of Distractor Tasks in LMs with Prior-Aware Decoding Raymond Douglas et.al. 2401.17692 null
2024-01-15 Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks Against LLM-Integrated Applications Xuchen Suo et.al. 2401.07612 null
2024-01-02 A Novel Evaluation Framework for Assessing Resilience Against Prompt Injection Attacks in Large Language Models Daniel Wankit Yip et.al. 2401.00991 null
2024-01-08 Jatmo: Prompt Injection Defense by Task-Specific Finetuning Julien Piet et.al. 2312.17673 link
2024-03-08 Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models Jingwei Yi et.al. 2312.14197 link
2023-12-12 Maatphor: Automated Variant Analysis for Prompt Injection Attacks Ahmed Salem et.al. 2312.11513 null
2024-05-25 Assessing Prompt Injection Risks in 200+ Custom GPTs Jiahao Yu et.al. 2311.11538 link
2023-11-02 Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game Sam Toyer et.al. 2311.01011 null
2024-06-01 Formalizing and Benchmarking Prompt Injection Attacks and Defenses Yupei Liu et.al. 2310.12815 link
2023-11-25 Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection Zekun Li et.al. 2308.10819 link
2023-07-03 From ChatGPT to ThreatGPT: Impact of Generative AI in Cybersecurity and Privacy Maanak Gupta et.al. 2307.00691 null
2024-03-02 Prompt Injection attack against LLM-integrated Applications Yi Liu et.al. 2306.05499 null

Audio Deepfake

Publish Date Title Authors PDF Code
2024-10-13 Prompt Tuning for Audio Deepfake Detection: Computationally Efficient Test-time Domain Adaptation with Limited Target Dataset Hideyuki Oiso et.al. 2410.09869 link
2024-10-09 Toward Robust Real-World Audio Deepfake Detection: Closing the Explainability Gap Georgia Channing et.al. 2410.07436 null
2024-10-09 Learn from Real: Reality Defender's Submission to ASVspoof5 Challenge Yi Zhu et.al. 2410.07379 null
2024-09-24 Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection Orchid Chetia Phukan et.al. 2409.15767 null
2024-09-21 Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition Orchid Chetia Phukan et.al. 2409.14221 null
2024-09-14 SafeEar: Content Privacy-Preserving Audio Deepfake Detection Xinfeng Li et.al. 2409.09272 link
2024-09-09 Continuous Learning of Transformer-based Audio Deepfake Detection Tuan Duy Nguyen Le et.al. 2409.05924 null
2024-08-20 Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio? Yuankun Xie et.al. 2408.10853 link
2024-08-14 WavLM model ensemble for audio deepfake detection David Combei et.al. 2408.07414 null
2024-08-13 Temporal Variability and Multi-Viewed Self-Supervised Representations to Tackle the ASVspoof5 Deepfake Challenge Yuankun Xie et.al. 2408.06922 null
2024-09-12 ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild Jiangyan Yi et.al. 2408.04967 null
2024-07-26 SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection Yi Zhu et.al. 2407.18517 null
2024-07-10 Targeted Augmented Data for Audio Deepfake Detection Marcella Astrid et.al. 2407.07598 null
2024-07-01 Deepfake Audio Detection Using Spectrogram-based Feature and Ensemble of Deep Learning Models Lam Pham et.al. 2407.01777 null
2024-06-24 One-Class Learning with Adaptive Centroid Shift for Audio Deepfake Detection Hyun Myung Kim et.al. 2406.16716 null
2024-06-12 Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio Yi Lu et.al. 2406.08112 null
2024-06-18 RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection Yujie Chen et.al. 2406.06086 link
2024-06-12 Harder or Different? Understanding Generalization of Audio Deepfake Detection Nicolas M. Müller et.al. 2406.03512 null
2024-06-09 Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy Yuankun Xie et.al. 2406.03240 null
2024-08-13 Towards Robust Audio Deepfake Detection: A Evolving Benchmark for Continual Learning Xiaohui Zhang et.al. 2405.08596 link
2024-05-15 The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio Yuankun Xie et.al. 2405.04880 link
2024-07-01 Training-Free Deepfake Voice Recognition by Leveraging Large-Scale Pre-Trained Models Alessandro Pianese et.al. 2405.02179 null
2024-04-24 CLAD: Robust Audio Deepfake Detection Against Manipulation Attacks with Contrastive Learning Haolin Wu et.al. 2404.15854 link
2024-04-23 Retrieval-Augmented Audio Deepfake Detection Zuheng Kang et.al. 2404.13892 null
2024-04-19 Enhancing Generalization in Audio Deepfake Detection: A Neural Collapse based Sampling and Training Approach Mohammed Yousif et.al. 2404.13008 null
2024-09-20 Cross-Domain Audio Deepfake Detection: Dataset and Analysis Yuang Li et.al. 2404.04904 null
2024-03-31 Heterogeneity over Homogeneity: Investigating Multilingual Speech Pre-Trained Models for Detecting Audio Deepfake Orchid Chetia Phukan et.al. 2404.00809 link
2024-03-21 Exploring Green AI for Audio Deepfake Detection Subhajit Saha et.al. 2403.14290 link
2024-03-04 A robust audio deepfake detection system via multi-view feature Yujie Yang et.al. 2403.01960 null
2023-12-15 What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection Xiaohui Zhang et.al. 2312.09651 null
2024-01-10 Audio Deepfake Detection with Self-Supervised WavLM and Multi-Fusion Attentive Classifier Yinlin Guo et.al. 2312.08089 null
2023-10-05 Securing Voice Biometrics: One-Shot Learning Approach for Audio Deepfake Detection Awais Khan et.al. 2310.03856 null
2023-09-15 HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods Hyun-seo Shin et.al. 2309.08208 link
2024-06-12 Towards generalisable and calibrated synthetic speech detection with self-supervised representations Octavian Pascu et.al. 2309.05384 null
2023-09-06 FSD: An Initial Chinese Dataset for Fake Song Detection Yuankun Xie et.al. 2309.02232 link
2023-08-29 Audio Deepfake Detection: A Survey Jiangyan Yi et.al. 2308.14970 null
2023-08-22 Complex-valued neural networks for voice anti-spoofing Nicolas M. Müller et.al. 2308.11800 null
2023-08-20 The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023 Zexin Cai et.al. 2308.10281 null
2023-06-27 TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection Jie Liu et.al. 2306.15212 null
2023-05-30 Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification Qing Wang et.al. 2305.19020 null
2023-05-25 Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion Rui Liu et.al. 2305.16353 link
2023-05-23 ADD 2023: the Second Audio Deepfake Detection Challenge Jiangyan Yi et.al. 2305.13774 null
2023-06-10 Defense Against Adversarial Attacks on Audio DeepFake Detection Piotr Kawa et.al. 2212.14597 link
2022-10-12 SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection Piotr Kawa et.al. 2210.06105 link
2022-08-02 Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features Jun Xue et.al. 2208.01214 null
2022-07-21 Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection Piotr Kawa et.al. 2206.13979 link
2024-08-27 Does Audio Deepfake Detection Generalize? Nicolas M. Müller et.al. 2203.16263 null
2022-03-03 The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge Juan M. Martín-Doñas et.al. 2203.01573 null
2024-07-02 ADD 2022: the First Audio Deep Synthesis Detection Challenge Jiangyan Yi et.al. 2202.08433 null
2021-11-04 WaveFake: A Data Set to Facilitate Audio Deepfake Detection Joel Frank et.al. 2111.02813 link
2021-06-26 Generalized Spoofing Detection Inspired from Audio Generation Artifacts Yang Gao et.al. 2104.04111 null
