Final answer:
To prepare for a white box penetration test of an internally developed data collection application, the client needs to provide the source code, architecture and data flow diagrams, user credentials at varying privilege levels, and documentation of the application's functionality and security measures. These details enable the penetration tester to evaluate the application's security comprehensively and identify vulnerabilities that could lead to unauthorized access to confidential research data.
Step-by-step explanation:
When scoping a white box penetration test of an internally developed data collection application, the client should give the tester enough information to examine the application from the inside. This includes, but may not be limited to, the application's source code, architecture diagrams, credentials for accessing the application at each privilege level, a data flow diagram showing how information is processed and stored, and any documentation or developer notes that describe the application's intended behavior and security controls. A list of all endpoints the application interacts with, along with its dependencies such as external libraries or services, is also helpful, because each of these can carry its own security implications.
Access to the application's source code allows a complete review of how it handles input, manages sessions, and accesses the database. This review can reveal vulnerabilities such as SQL injection, cross-site scripting (XSS), or insecure direct object references. Architecture and data flow diagrams give insight into the overall system design and the points where sensitive data might be exposed or mishandled. Credentials at various access levels are critical for testing how the application enforces permissions and for determining whether privilege escalation is possible.
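As an illustration of what a source code review can surface, the following is a minimal sketch of an SQL injection flaw and its fix. It is not taken from the client's application: the `records` table, the function names, and the sample data are all hypothetical, and an in-memory SQLite database stands in for whatever database the real application uses.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, owner TEXT, data TEXT)"
    )
    conn.executemany(
        "INSERT INTO records (owner, data) VALUES (?, ?)",
        [("alice", "sample A"), ("bob", "sample B")],
    )
    return conn


def find_records_vulnerable(conn, owner):
    # Flaw a reviewer would flag: user input is pasted into the SQL text,
    # so a crafted value can change the structure of the query.
    query = f"SELECT owner, data FROM records WHERE owner = '{owner}'"
    return conn.execute(query).fetchall()


def find_records_safe(conn, owner):
    # Fix: a parameterized query keeps the input as data, never as SQL.
    query = "SELECT owner, data FROM records WHERE owner = ?"
    return conn.execute(query, (owner,)).fetchall()


conn = make_demo_db()
payload = "x' OR '1'='1"  # classic injection string

leaked = find_records_vulnerable(conn, payload)  # returns every row
blocked = find_records_safe(conn, payload)       # returns no rows

print("vulnerable:", leaked)
print("safe:", blocked)
```

With the source code in hand, the tester can find the string-built query by reading it, rather than having to guess at its existence by probing inputs from the outside.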