systematic-debugging
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH
Tags: CREDENTIALS_UNSAFE, COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION
Full Analysis
- CREDENTIALS_UNSAFE (HIGH): The skill directs the agent to log sensitive environment variables and system security states. Evidence: Phase 1 includes examples like 'env | grep IDENTITY' (exposing potential secrets) and macOS 'security list-keychains' (exposing host security configurations).
- Indirect Prompt Injection (HIGH): The skill mandates acting on untrusted external data (logs and errors) while granting the agent the capability to generate and execute shell commands. * Ingestion points: Error messages, stack traces, and component logs. * Boundary markers: Absent. * Capability inventory: High (agent-generated bash execution). * Sanitization: Absent.
- COMMAND_EXECUTION (MEDIUM): The 'Iron Law' framing pressures the agent to perform instrumentation via shell commands, which bypasses typical caution when handling external context.
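As an added illustration of the CREDENTIALS_UNSAFE finding (this is example code, not part of the audit or of the skill being audited), the script below contrasts the flagged 'env | grep IDENTITY' pattern with a redacted variant that reports only variable names and value lengths. The variables IDENTITY_TOKEN and IDENTITY_USER and their values are constructed placeholders for the demonstration.

```shell
# Constructed sample environment: hypothetical names and placeholder values,
# standing in for whatever secrets a real host might carry.
export IDENTITY_TOKEN="hunter2-secret"
export IDENTITY_USER="alice"

# The form flagged by the audit: every matching variable is printed with its
# value verbatim, so a token ends up in the agent transcript and any log sink.
unsafe_dump() {
  env | grep IDENTITY
}

# Redacted alternative: confirms which variables exist without revealing values.
# Only lines of the form NAME=value are considered; the value is replaced by
# its length so a debugging session can still tell "set" from "empty".
redacted_dump() {
  env | grep '^[A-Za-z_][A-Za-z0-9_]*=' | grep IDENTITY | while IFS='=' read -r name value; do
    printf '%s=<redacted, %d chars>\n' "$name" "${#value}"
  done
}
```

Running unsafe_dump shows the token in clear; redacted_dump yields lines such as IDENTITY_TOKEN=<redacted, 14 chars>, which is enough for a debugging step to verify configuration without exfiltrating the secret.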
Recommendations
- AI detected serious security threats