The AGILITY Active Probe is designed to automate and optimize network diagnostics by triggering analysis, collecting results, and tracking performance metrics. Here’s how to effectively use this feature in your network diagnostic workflows.

Prerequisites:

  • AGILITY Platform access with appropriate user roles.

  • Basic understanding of network diagnostics, SFTP, and S3 operations.

  • Installed dependencies for the Active Probe (see Dependencies section).

1. Setting Up Active Probe

To begin using the Active Probe, ensure that the following components are set up:

  • AGILITY API Access: Ensure that you have proper access to the AGILITY API endpoints required to collect analysis results. You will need to configure the probe to interact with these endpoints.

  • File Storage Setup: The Active Probe requires access to file storage systems like MinIO or a dedicated storage CoS for storing network capture files (PCAPs) that will be analyzed.

  • Telemetry Setup: Make sure your OpenTelemetry (OTEL) integration is correctly configured for metrics collection and forwarding to your monitoring systems.

2. Configuring the Active Probe

A. Setting the File Locations for Analysis

  1. Define File Sources: Choose the set of files (PCAPs) you want to analyze. These files can either be:

    • Stored in MinIO or

    • Located on dedicated storage.

  2. Selecting Files: Ensure that the files selected for analysis are in a supported format (e.g., PCAP) and accessible to the Active Probe module.

B. Schedule Configuration

The Active Probe can be configured to run at specific times or intervals. You can set up a schedule for triggering the analysis:

  1. Define a Schedule: Specify how often you want the probe to run. You can set a recurring schedule or trigger the analysis on-demand.

  2. Modify the Schedule: Update the schedule via the configuration interface to align with your network’s diagnostic needs.
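As a rough illustration of how a recurring schedule might be derived, the sketch below reads an interval in minutes from the EXECUTION_SCHEDULE environment variable (described under Running the Active Probe) and converts it to seconds. The function name and the validation rule are assumptions made for this example, not the probe's actual code.

```python
import os

# Default of 15 minutes, matching the Helm default shown in this guide.
DEFAULT_INTERVAL_MINUTES = 15


def interval_seconds(env=None):
    """Return the probe interval in seconds, read from EXECUTION_SCHEDULE (minutes).

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get("EXECUTION_SCHEDULE", "").strip()
    minutes = int(raw) if raw else DEFAULT_INTERVAL_MINUTES
    if minutes <= 0:
        # Hypothetical rule: reject zero or negative intervals.
        raise ValueError("EXECUTION_SCHEDULE must be a positive number of minutes")
    return minutes * 60
```

With APScheduler 3.x, the value could then be passed to an interval job, for example scheduler.add_job(run_probe, "interval", seconds=interval_seconds()), where run_probe is a hypothetical function that performs one probe cycle.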

C. Setting Up the API for Results Collection

Ensure that your AGILITY platform API is configured to provide results for the /v1/analysis and /v1/analysis/{analysis_id}/summary endpoints. The Active Probe will call these endpoints to retrieve the analysis results and compare them with expected outcomes.
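The two endpoint paths can be assembled as in the minimal sketch below. The base URL, the helper names, and the URL-encoding of the analysis ID are assumptions for illustration. Authentication and the response format are not covered here because this guide does not specify them.

```python
from urllib.parse import quote, urljoin


def analysis_url(base_url):
    """Build the URL of the /v1/analysis endpoint for a given API base URL."""
    return urljoin(base_url.rstrip("/") + "/", "v1/analysis")


def summary_url(base_url, analysis_id):
    """Build the URL of the /v1/analysis/{analysis_id}/summary endpoint."""
    # Percent-encode the ID so unusual characters cannot alter the path.
    safe_id = quote(str(analysis_id), safe="")
    return urljoin(base_url.rstrip("/") + "/", f"v1/analysis/{safe_id}/summary")
```

A caller could then fetch a summary with the requests library, for example requests.get(summary_url(base, analysis_id), timeout=10), adding whatever credentials your deployment requires.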

3. Running the Active Probe

Once set up, the Active Probe will trigger the analysis automatically or on a defined schedule. Here's how it operates:

...

Trigger the Analysis: When the Active Probe is activated, it triggers the analysis by performing an SFTP or S3 copy operation to initiate the diagnostic process.
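The S3 variant of the trigger step might look like the sketch below, which copies one PCAP object to a location that the platform is assumed to watch. The bucket names, keys, and the idea of a watched target location are hypothetical. The client is expected to behave like a boto3 S3 client, whose copy_object call takes Bucket, Key, and CopySource.

```python
def trigger_analysis(s3_client, source_bucket, source_key, target_bucket, target_key):
    """Copy one PCAP object to the (assumed) ingest location and return its URI.

    `s3_client` is any object exposing copy_object like a boto3 S3 client,
    e.g. boto3.client("s3", endpoint_url=...) pointed at a MinIO endpoint.
    """
    s3_client.copy_object(
        Bucket=target_bucket,
        Key=target_key,
        CopySource={"Bucket": source_bucket, "Key": source_key},
    )
    return f"s3://{target_bucket}/{target_key}"
```

Passing the client in as a parameter keeps the function easy to exercise with a stub, without needing a live storage endpoint.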

Collecting Analysis Results:

...

The Active Probe uses the /v1/analysis endpoint to fetch the analysis results.

...

  • Grafana access for monitoring and visualization.

  • The Active Probe’s dependencies are automatically installed during deployment.

...

Running the Active Probe

Once deployed, the Active Probe runs automatically:

  • Automatic Triggering: The probe triggers diagnostic analysis automatically according to a predefined schedule set during deployment. This schedule is expressed in minutes and is configured via the EXECUTION_SCHEDULE environment variable:

    Code Block
    name: EXECUTION_SCHEDULE
    value: {{ (index .Values "active-probe" "interval") | default "15" | quote }}

  • Analysis Collection: The Active Probe collects analysis results by accessing AGILITY API endpoints (e.g., /v1/analysis and /v1/analysis/{analysis_id}/summary).

...

  • Result Validation: The probe compares the collected results with expected outcomes, logging success or failure based on the analysis status:

  • If the status of the analysis is “processing” or “warning”, the probe considers it a success.

  • If no call flows are found, or the status is something other than “processing” or “warning,” the probe logs it as a failure.
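The success and failure rules above can be expressed as a small function. This is a sketch under assumptions: the function name and the cause strings are hypothetical, and the real probe may represent call flows and failure causes differently.

```python
# Statuses that the probe treats as a success, per the rules above.
SUCCESS_STATUSES = {"processing", "warning"}


def validate_result(status, call_flows):
    """Return (ok, cause_for_failure) for one collected analysis result."""
    if not call_flows:
        # No call flows found: always a failure, whatever the status.
        return False, "no_call_flows"
    if status not in SUCCESS_STATUSES:
        return False, f"unexpected_status:{status}"
    return True, None
```

For example, validate_result("warning", ["flow-1"]) reports a success, while an empty list of call flows reports a failure regardless of status.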

...

Incrementing Counters: The probe maintains counters for:

  • Validated Results (successful analyses).

  • Failed Results (failed analyses).

These counters are labeled with key identifiers, such as model, pcap_name, call_id, and cause_for_failure.
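To make the labeled counters concrete, here is a minimal in-memory sketch that keeps one count per combination of model, pcap_name, call_id, and cause_for_failure. The class and method names are assumptions. In the probe itself these values would more likely be recorded through OpenTelemetry counters, with the labels passed as attributes.

```python
from collections import Counter


class ProbeCounters:
    """In-memory stand-in for the validated and failed result counters."""

    def __init__(self):
        self.validated = Counter()
        self.failed = Counter()

    def record(self, ok, *, model, pcap_name, call_id, cause_for_failure=None):
        """Increment the matching counter for one label combination."""
        labels = (model, pcap_name, call_id, cause_for_failure)
        target = self.validated if ok else self.failed
        target[labels] += 1
```

Keying each counter by the full label tuple mirrors how a metrics backend keeps a separate time series for every distinct label set.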

4. Monitoring and Viewing Results

A. Dashboard for Results

A dedicated dashboard provides a clear view of the Active Probe’s performance. The dashboard displays:

  • Validated Results: The number of successful analysis attempts.

  • Failed Results: The number of failed attempts.

Monitoring and Viewing Results

Real-time monitoring and result tracking are integrated seamlessly with AGILITY:

  • Dashboard for Results: A dedicated dashboard shows:

    • Validated Results (successful analyses).

    • Failed Results, with labels such as model, call ID, and cause for failure.

...

  • OpenTelemetry Integration: The Active Probe sends real-time metrics to OpenTelemetry (OTEL). Metrics such as active_probe_attempt_count (the total number of analysis attempts) and active_probe_analysis_count (the total number of analyses completed successfully) are forwarded to your monitoring systems, providing real-time insights into probe performance.

...

Handling Failures and Retries

The probe is designed to handle errors gracefully at every stage of operation. If the analysis fails or the probe does not retrieve the expected results, the following logic applies:

  • Retry Logic: The probe automatically retries the analysis collection process if initial attempts fail.

  • Failure Logging: If the analysis cannot be completed successfully within the retry window, the failure is logged, and counters are updated accordingly.
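The retry behaviour described in this section can be sketched as a loop that keeps calling a fetch function until it returns a result or a retry window elapses. The window length, the delay between attempts, and the function name are hypothetical, since this guide does not state the actual values.

```python
import time


def collect_with_retry(fetch, retry_window_s=60.0, delay_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until it returns a non-None result or the window elapses.

    Returns (result, attempts); result is None when every attempt failed,
    in which case the caller would log the failure and update its counters.
    """
    deadline = clock() + retry_window_s
    attempts = 0
    while True:
        attempts += 1
        try:
            result = fetch()
            if result is not None:
                return result, attempts
        except Exception:
            # Treat any error as a failed attempt and retry within the window.
            pass
        if clock() + delay_s > deadline:
            return None, attempts
        sleep(delay_s)
```

The sleep and clock parameters exist only so that the loop can be exercised without real waiting. In normal use the defaults apply.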

6. Advanced Features

A. Prefect Pipeline Automation

The Active Probe supports Prefect Pipeline Automation. This feature allows you to:

  • Automate Pipeline Creation: Automatically generate Prefect pipelines for the Active Probe’s operations, ensuring seamless integration with your broader diagnostic workflows.

  • Configure Pipelines: Define and schedule the pipelines to automate the process of triggering, collecting, and comparing analysis results without manual intervention.

B. Custom Telemetry Labels

You can customize the telemetry labels used to track and monitor the probe's performance. These labels allow for better classification and filtering of probe results in your monitoring dashboards.

7. Metrics and Analytics

The following metrics are exposed through OpenTelemetry:

  • active_probe_attempt_count: Total number of analysis attempts made to collect and compare analysis results.

  • active_probe_analysis_count: Total number of successful analysis comparisons.

These metrics include relevant labels that provide context to the results, such as model, pcap_name, call_id, and cause_for_failure.


Additionally, the probe sends the status of each analysis (success, failure, and other statuses) as labels within these metrics, allowing for more granular tracking and analysis of probe activity.

...

Troubleshooting

For troubleshooting, follow these steps:

  • MinIO S3 Configuration: Ensure that the file locations are correct and accessible by the Active Probe.

  • Review API Connectivity: Verify that the AGILITY API is responding and providing the expected data.

  • Telemetry Monitoring: Confirm that OpenTelemetry metrics are being correctly forwarded to the monitoring systems.

  • Log Review: Review logs for errors related to file selection, analysis collection, or comparison failures. Logs can be extended to additional outputs like Loki or OTEL if needed.

Dependencies

Ensure that the following libraries are installed for proper operation:

  • APScheduler==3.10.4

  • requests==2.31.0

  • PyYAML==6.0

  • opentelemetry-api==1.21.0

  • opentelemetry-sdk==1.21.0

  • opentelemetry-exporter-otlp==1.21.0

  • opentelemetry-exporter-otlp-proto-grpc==1.21.0

  • opentelemetry-instrumentation==0.42b0

  • boto3==1.26.96

Conclusion

The AGILITY Active Probe is a powerful tool for ensuring efficient, real-time performance monitoring. With flexible configuration options, automated scheduling, integrated monitoring via OpenTelemetry, and detailed failure tracking, the Active Probe provides a robust solution for maintaining optimal network performance and ensuring high availability of services.
