Data granularity and retention
Aggregation of data values
The Monitor Service collects a variety of data, including user session usage, user logon performance details, session load balancing details, and connection and machine failure information. Data is aggregated differently depending on its category. Understanding the aggregation of data values presented using the OData Method APIs is critical to interpreting the data. For example:
- Connected Sessions and Machine Failures occur over a period of time. Therefore, they are exposed as maximums over a time period.
- LogOn Duration measures a length of time, and is therefore exposed as an average over a time period.
- LogOn Count and Connection Failures are counts of occurrences over a period of time, and are therefore exposed as sums over a time period.
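As a rough illustration of these three aggregation styles, the following Python sketch rolls a few made-up per-minute samples into one value for a larger interval. The `aggregate` helper, the metric keys, and the sample values are hypothetical and are not part of the Monitor Service API; they only show why the choice of maximum, average, or sum depends on what the metric measures.

```python
from statistics import mean

# How each metric category is rolled up, following the list above.
AGGREGATORS = {
    "ConnectedSessions": max,   # state that exists over time -> maximum
    "LogOnDuration": mean,      # a length of time -> average
    "LogOnCount": sum,          # a count of occurrences -> sum
}

def aggregate(metric, samples):
    """Roll per-minute samples up into a single value for a larger interval."""
    return AGGREGATORS[metric](samples)

# Hypothetical per-minute samples for three consecutive minutes.
connected_sessions = [12, 15, 14]
logon_durations_ms = [2000, 4000, 3000]
logon_counts = [3, 0, 2]

peak_sessions = aggregate("ConnectedSessions", connected_sessions)
average_logon_ms = aggregate("LogOnDuration", logon_durations_ms)
total_logons = aggregate("LogOnCount", logon_counts)
```

Summing the connected-session samples instead would count the same long-running session once per minute, which is why that category is reported as a maximum.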
Concurrent data evaluation
Sessions must overlap to be considered concurrent. However, when the time interval is 1 minute, all sessions in that minute (whether or not they overlap) are considered concurrent: the interval is so short that the performance overhead of calculating exact overlap outweighs the added precision. If the sessions occur in the same hour, but not in the same minute, they are not considered to overlap.
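The difference between the two rules can be sketched in Python as follows. This is only an illustration of the behavior described above, not the Monitor Service implementation; the function names and the session times are hypothetical, and sessions are modeled as simple `(start, end)` pairs.

```python
from datetime import datetime, timedelta

def concurrent_in_minute(sessions, minute_start):
    """Count every session active at any point within the one-minute interval."""
    minute_end = minute_start + timedelta(minutes=1)
    return sum(1 for start, end in sessions
               if start < minute_end and end > minute_start)

def peak_overlap(sessions):
    """Largest number of sessions that truly overlap, using a sweep over events."""
    events = []
    for start, end in sessions:
        events.append((start, 1))    # session begins
        events.append((end, -1))     # session ends
    # At the same instant, process ends before starts so that
    # back-to-back sessions are not treated as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

base = datetime(2024, 1, 1, 10, 0, 0)
sessions = [
    (base + timedelta(seconds=5), base + timedelta(seconds=20)),   # same minute
    (base + timedelta(seconds=30), base + timedelta(seconds=50)),  # same minute, no overlap
    (base + timedelta(minutes=30), base + timedelta(minutes=40)),  # same hour only
]

same_minute = concurrent_in_minute(sessions, base)
true_overlap = peak_overlap(sessions)
```

With these sample sessions, the per-minute rule counts the first two sessions as concurrent even though they never overlap, while the overlap rule finds at most one session active at any instant.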
Correlation of summary tables with raw data
The data model represents metrics in two different ways:
- The summary tables represent aggregate views of the metrics at per-minute, per-hour, and per-day granularities.
- The raw data represents individual events or current state tracked in the session, connection, application, and other objects.
When attempting to correlate data across API calls or within the data model itself, it is important to understand the following concepts and limitations:
- No summary data for partial intervals. Metrics summaries are designed to meet the needs of historical trends over long periods of time. These metrics are aggregated into the summary table for complete intervals. There will be no summary data for a partial interval at the beginning (oldest available data) of the data collection nor at the end. When viewing aggregations of a day (Interval=1440), this means that the first and most recent incomplete days will have no data. Although raw data may exist for those partial intervals, it will never be summarized. You can determine the earliest and latest aggregate interval for a particular data granularity by pulling the min and max SummaryDate from a particular summary table. The SummaryDate column represents the start of the interval. The Granularity column represents the length of the interval for the aggregate data.
- Correlating by time. Metrics are aggregated into the summary table for complete intervals as described above. They can be used for historical trends, but raw events may reflect a more current state than what has been summarized for trend analysis. Any time-based comparison of summary data to raw data must take into account that there is no summary data for partial intervals at the beginning and end of the time period.
- Missed and latent events. Metrics that are aggregated into the summary table may be slightly inaccurate if events are missed or arrive late relative to the aggregation period. Although the Monitor Service attempts to maintain an accurate current state, it does not go back in time to recompute aggregation in the summary tables for missed or latent events.
- Connection High Availability. During connection HA there will be gaps in the summary data counts of current connections, but the session instances will still be running in the raw data.
- Data retention periods. Data in the summary tables is retained on a different grooming schedule from the schedule for raw event data. Data may be missing because it has been groomed away from summary or raw tables. Retention periods may also differ for different granularities of summary data. Finer-grained data (minutes) is groomed more quickly than coarser-grained data (days). If data is missing from one granularity due to grooming, it may be found in a coarser granularity. Since the API calls only return the specific granularity requested, receiving no data for one granularity does not mean the data doesn’t exist at a coarser granularity for the same time period.
- Time zones. Metrics are stored with UTC time stamps. Summary tables are aggregated on hourly time zone boundaries. For time zones that don’t fall on hourly boundaries, there may be some discrepancy as to where data is aggregated.
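To make the partial-interval limitation concrete, the following Python sketch lists which intervals would be complete, and therefore eligible for summary data, between the oldest and newest raw events. It is an illustration only: the `complete_intervals` helper is hypothetical, and aligning interval boundaries to whole minutes, hours, or days in UTC is an assumption. The authoritative bounds are the minimum and maximum SummaryDate in the relevant summary table.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def complete_intervals(first_event, last_event, granularity_minutes):
    """Return start times of intervals lying entirely within [first_event, last_event]."""
    step = timedelta(minutes=granularity_minutes)
    # Round first_event up to the next interval boundary (assumed UTC-aligned).
    offset = (first_event - EPOCH) % step
    start = first_event if offset == timedelta(0) else first_event + (step - offset)
    starts = []
    while start + step <= last_event:
        starts.append(start)   # analogous to SummaryDate: the start of the interval
        start += step
    return starts

# Hypothetical raw data range: collection starts mid-morning, latest event is early morning.
first = datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc)
last = datetime(2024, 1, 4, 8, 0, tzinfo=timezone.utc)

daily = complete_intervals(first, last, 1440)
hourly = complete_intervals(first, last, 60)
```

In this example, raw data exists for parts of 1 January and 4 January, but only 2 January and 3 January are complete days, so only those two would appear at daily granularity.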
Granularity and retention
The granularity of aggregated data retrieved by Director is a function of the time span (T) requested. The rules are as follows:
- 0 < T <= 1 hour uses per-minute granularity
- 1 hour < T <= 30 days uses per-hour granularity
- T > 31 days uses per-day granularity
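These rules can be sketched as a small Python function. This is an illustration rather than Director's actual logic: the `granularity_for` name is hypothetical, the return values 1 and 60 are inferred by analogy with the Interval=1440 value used for daily data, and the rules as listed do not say what happens for spans between 30 and 31 days, so the sketch assumes per-day granularity for anything longer than 30 days. The narrowest rule is checked first.

```python
from datetime import timedelta

def granularity_for(span):
    """Return the summary granularity, in minutes, for a requested time span."""
    if span <= timedelta(0):
        raise ValueError("time span must be positive")
    if span <= timedelta(hours=1):
        return 1        # per-minute granularity
    if span <= timedelta(days=30):
        return 60       # per-hour granularity
    return 1440         # per-day granularity (assumed for spans over 30 days)

last_hour = granularity_for(timedelta(hours=1))
last_week = granularity_for(timedelta(days=7))
last_year = granularity_for(timedelta(days=365))
```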
Requested data that does not come from aggregated data comes from the raw Session and Connection information. This data tends to grow fast, and therefore has its own grooming setting. Grooming ensures that only relevant data is kept long term, which improves performance while maintaining the granularity required for reporting. Customers on Platinum licensed Sites can change the grooming retention to their desired number of retention days; otherwise, the default is used.
To access the settings, run the following PowerShell commands on the Delivery Controller:
asnp Citrix.*
Get-MonitorConfiguration
Set-MonitorConfiguration -<setting name> <value>
The following settings are used to control grooming:
No. | Setting name | Affected grooming | Default value Platinum (days) | Default value non-Platinum (days)
---|---|---|---|---
1 | GroomSessionsRetentionDays | Session and Connection records retention after Session termination | 90 | 7
2 | GroomFailuresRetentionDays | MachineFailureLog and ConnectionFailureLog records | 90 | 7
3 | GroomLoadIndexesRetentionDays | LoadIndex records | 90 | 7
4 | GroomDeletedRetentionDays | Machine, Catalog, DesktopGroup, and Hypervisor entities that have a LifecycleState of ‘Deleted’. This also deletes any related Session, SessionDetail, Summary, Failure, or LoadIndex records. | 90 | 7
5 | GroomSummariesRetentionDays | DesktopGroupSummary, FailureLogSummary, and LoadIndexSummary records. Aggregated data - daily granularity. | 90 | 7
6 | GroomMachineHotfixLogRetentionDays | Hotfixes applied to the VDA and Controller machines | 90 | 90
7 | GroomMinuteRetentionDays | Aggregated data - minute granularity | 3 | 3
8 | GroomHourlyRetentionDays | Aggregated data - hourly granularity | 32 | 7
9 | GroomApplicationInstanceRetentionDays | Application Instance history | 90 | 0
10 | GroomNotificationLogRetentionDays | Notification Log records | 90 |
11 | GroomResourceUsageRawDataRetentionDays | Resource utilization data - raw data | 1 | 1
12 | GroomResourceUsageMinuteDataRetentionDays | Resource utilization summary data - minute granularity | 7 | 7
13 | GroomResourceUsageHourDataRetentionDays | Resource utilization summary data - hour granularity | 30 | 7
14 | GroomResourceUsageDayDataRetentionDays | Resource utilization summary data - day granularity | 90 | 7
15 | GroomProcessUsageRawDataRetentionDays | Process utilization data - raw data | 1 | 1
16 | GroomProcessUsageMinuteDataRetentionDays | Process utilization data - minute granularity | 3 | 3
17 | GroomProcessUsageHourDataRetentionDays | Process utilization data - hour granularity | 7 | 7
18 | GroomProcessUsageDayDataRetentionDays | Process utilization data - day granularity | 30 | 7
19 | GroomSessionMetricsDataRetentionDays | Session metrics data | 1 | 1
20 | GroomMachineMetricDataRetentionDays | Machine metrics data | 3 | 3
21 | GroomMachineMetricDaySummaryDataRetentionDays | Machine metrics summary data | 90 | 7
22 | GroomApplicationErrorsRetentionDays | Application error data | 1 | 1
23 | GroomApplicationFaultsRetentionDays | Application failure data | 1 | 1
Caution: Modifying values on the Monitor Service database requires restarting the service for the new values to take effect. You are advised to make changes to the Monitor Service database only under the direction of Citrix Support.
Notes on grooming retention: GroomProcessUsageRawDataRetentionDays, GroomResourceUsageRawDataRetentionDays, and GroomSessionMetricsDataRetentionDays are limited to their default values of 1, while GroomProcessUsageMinuteDataRetentionDays is limited to its default value of 3. The PowerShell commands to set these values have been disabled, because the process usage data tends to grow quickly. Additionally, license-based retention settings are as follows:
- Premium licensed Sites - you can update the grooming retention settings above to any number of days.
- Advanced licensed Sites - the grooming retention for all settings is limited to 31 days.
- All other Sites - the grooming retention for all settings is limited to 7 days.
Exceptions:
- GroomApplicationInstanceRetentionDays can be set only in Premium licensed Sites.
- GroomApplicationErrorsRetentionDays and GroomApplicationFaultsRetentionDays are limited to 31 days in Premium licensed Sites.
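One way to read these limits together is sketched below in Python. This is an interpretation of the notes above, not product code: the `max_retention_days` helper and the edition strings are hypothetical, and the sketch assumes that the settings locked to their default values take precedence over any license-based limit.

```python
# Settings the notes describe as locked to their default values (days).
FIXED_DAYS = {
    "GroomProcessUsageRawDataRetentionDays": 1,
    "GroomResourceUsageRawDataRetentionDays": 1,
    "GroomSessionMetricsDataRetentionDays": 1,
    "GroomProcessUsageMinuteDataRetentionDays": 3,
}

# Settings capped at 31 days even on Premium licensed Sites.
CAPPED_ON_PREMIUM = {
    "GroomApplicationErrorsRetentionDays": 31,
    "GroomApplicationFaultsRetentionDays": 31,
}

PREMIUM_ONLY = {"GroomApplicationInstanceRetentionDays"}

def max_retention_days(setting, edition):
    """Return the largest allowed retention in days, or None when there is no limit."""
    if setting in FIXED_DAYS:
        return FIXED_DAYS[setting]   # assumed to override the license limits
    if setting in PREMIUM_ONLY and edition != "Premium":
        raise ValueError(f"{setting} can be set only on Premium licensed Sites")
    if edition == "Premium":
        return CAPPED_ON_PREMIUM.get(setting)   # None means any number of days
    if edition == "Advanced":
        return 31
    return 7   # all other Sites
```

For example, `max_retention_days("GroomSessionsRetentionDays", "Advanced")` returns 31, while the same setting on a Premium licensed Site returns `None` to indicate that any number of days is allowed.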
Retaining data for long periods will have the following implications on table sizes:
- Hourly data. If hourly data is allowed to stay in the database for up to two years, a site of 1000 delivery groups could cause the database to grow as follows: 1000 delivery groups x 24 hours/day x 365 days/year x 2 years = 17,520,000 rows of data. The performance impact of such a large amount of data in the aggregation tables is significant. Given that the dashboard data is drawn from this table, the requirements on the database server may be large. Excessively large amounts of data may have a dramatic impact on performance.
- Session and event data. This is the data that is collected every time a session is started and a connection/reconnection is made. For a large site (100K users), this data will grow very fast. For example, two years’ worth of these tables would gather more than a TB of data, requiring a high-end enterprise-level database.
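The hourly-data estimate above can be reproduced for other retention values with a short Python sketch. It assumes, as the worked figure implies, one summary row per delivery group per hour; the `hourly_summary_rows` helper is hypothetical, and real tables may hold more rows per interval.

```python
def hourly_summary_rows(delivery_groups, retention_days):
    """Estimate rows of hourly summary data kept for the given retention period."""
    return delivery_groups * 24 * retention_days

# Two years of hourly data for 1000 delivery groups, as in the example above.
two_years = hourly_summary_rows(1000, 365 * 2)

# The same site with the 32-day value listed for GroomHourlyRetentionDays.
listed_default = hourly_summary_rows(1000, 32)
```

Under this assumption, the two-year case gives the 17,520,000 rows quoted above, compared with 768,000 rows when hourly data is kept for 32 days.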