There are particular cases where we need to analyze server logs,
trace events, and track alerts.
Analyzing server logs is a heavy burden for application administrators
and analysts. To make it easier for them to track and respond to
incidents, there must be a streamlined strategy for operational
activities. Every project is expected to have Change Management, Incident
Management, and Problem Management in place to control and keep track of
operational activities; HPSD and Service-Now are two such tools. But there
should also be a technology strategy at the enterprise level for tracking
events and incidents and for log management or error logging, covering
all the tools and tech stacks. One tool that has emerged for log
management is Splunk.
Splunk helps us see errors, alerts, and incidents in a
visual dashboard, which lets us understand at a glance what is happening
in the infrastructure. Reducing rework is not just a matter of better
utilization of human resources; it also takes optimum utilization of
technology components that reduce human intervention. The
most basic form of reuse is to start using templates,
even for emails and communication mailers, rather than rewriting them every time.
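To illustrate the template idea, here is a minimal sketch of a reusable alert-mail body in shell. The function name render_alert_body and its three fields are hypothetical, chosen only for this example; the point is that the wording lives in one place and every script fills in the blanks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: one fixed template for alert mails, so the wording
# is written once and reused by every monitoring script.
# Arguments: 1 = host name, 2 = name of the check, 3 = free-text detail.
render_alert_body() {
    local host="$1" check="$2" detail="$3"
    printf 'Host:   %s\n' "$host"
    printf 'Check:  %s\n' "$check"
    printf 'Detail: %s\n' "$detail"
}
```

A script could pipe the result into whatever mail client the server has, for example a mailx-style mail command if one is installed, which is an assumption about the environment.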
Shell scripting is a robust capability
that helps reduce human intervention in most technology
infrastructure projects. Even for Disaster Recovery, we built our own custom
scripts to replicate the complete infrastructure from Production to DR
and from DR back to Production. Licensing a third-party
tool is an added cost, and projects are cost-sensitive.
The cost-effective approach is to write custom scripts for the activities below. Once you
have written the scripts, tested them, and scheduled them through cron,
you are all set. That alone could take about 50% of the burden off
administrators, which would be a great relief. Set up an email alert in each of the
scripts so that it sends out its report.
- CPU usage on the server: Script to monitor the CPU usage on the server.
- RAM consumed by the application: Hourly script that triggers an alert whenever memory usage on the server is high.
- Storage used by the app against the limit on the server: Hourly script that triggers an alert whenever server storage utilization is high.
- Application monitoring: Script to send an email alert when the app is down.
- Backup scripts for the app-related data stores: Scheduled daily script that backs up the repository.
- Scripts for code deployments and releases: Everything should be done through scripts, and logs should be generated for each release and deployment.
- Effective use of version control.
- Hourly scripts that generate reports from the repository on Failed, Aborted, and Succeeded jobs, and on INSERTED and REJECTED records.
- Scripts for data store connections: Connections are created through a script, so naming standards are applied consistently.
- Weekly script to release space on the servers: Truncates or removes older logs and files that are no longer needed.
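
As a rough illustration of two items from the list above (the hourly storage alert and the weekly cleanup), here is a minimal sketch in shell. All function names, paths, thresholds, and the mail address in the comments are placeholders invented for this example, not the scripts used on the project. It assumes a find that supports -delete (GNU and BSD find both do), and it removes old files outright rather than truncating them.

```shell
#!/usr/bin/env bash
# Sketch only: names, paths, and limits below are hypothetical.

# Print the usage percentage (digits only) of the filesystem holding $1.
# df -P keeps each filesystem on one line, so field 5 is the capacity.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Succeed (exit status 0) when usage $1 is at or above limit $2.
is_over_limit() {
    [ "$1" -ge "$2" ]
}

# Print an alert line when the filesystem holding $1 is at or above $2 percent.
check_storage() {
    local path="$1" limit="$2" used
    used="$(disk_usage_percent "$path")"
    if is_over_limit "$used" "$limit"; then
        echo "ALERT: $path is at ${used}% (limit ${limit}%)"
    fi
}

# Delete *.log files under directory $1 last modified more than $2 days ago.
purge_old_logs() {
    find "$1" -type f -name '*.log' -mtime +"$2" -delete
}

# Example crontab entries (script paths are placeholders):
#   0 * * * *  /opt/scripts/check_storage.sh   # hourly storage check
#   0 2 * * 0  /opt/scripts/purge_logs.sh      # weekly cleanup, Sunday 02:00
# To send the alert by email, the output of check_storage could be piped to
# a mailx-style client, e.g.  mail -s "Storage alert" admin@example.com
# (assumes such a client is installed and configured).
```

Keeping the threshold comparison in its own function makes it easy to reuse the same pattern for the CPU and memory checks, with only the command that reads the metric changing.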