HPC Systems Administrator (IT@JH Research…
- Johns Hopkins University (Baltimore, MD)
IT@JH Research Computing is seeking an **HPC Systems Administrator** who will support the daily operation and upkeep of Johns Hopkins University’s high-performance computing and AI environments. This role helps maintain the reliability and availability of compute, storage, and network resources used by researchers across campus. Core responsibilities include monitoring basic system health, assisting with user account administration, installing software updates, and helping with node configuration under the direction of senior staff. Work involves resolving tickets, performing routine maintenance, and participating in infrastructure projects as guided by experienced engineers. The position works closely with senior HPC team members to ensure stable, well-documented systems and to provide consistent, dependable support for the university’s computational research.
Specific Duties & Responsibilities
+ The responsibilities listed below are typical examples of the work performed by this position.
+ Not all duties assigned to this position are included, nor is it expected that everyone in this position will be assigned every job responsibility.
_Systems Analysis/Design (Environment/Platform)_
+ With guidance and direction, design simple business, clinical, education, or infrastructure solutions by meeting with customers to observe and understand current processes and the issues related to those processes. Provide written documentation and diagrams of findings to share with the client and other IT colleagues.
+ Design simple solutions that conform to institutional policies, standards, and guidelines, to the infrastructure environment, and to vendor and industry best practices to deliver a quality product.
+ Participate in the selection of infrastructure applications that reside between end user applications and hardware operating systems by working with vendors, customers, and other sources (e.g., open source or Internet2 initiatives) to provide configurable tools to the customers.
_Install & Configure_
+ Install and configure basic server hardware and operating systems by following technical documentation to provide a working product.
+ Evaluate, implement, and monitor appropriate basic software and hardware solutions by using best practices for the environment to ensure system integrity.
+ Install and configure infrastructure applications by following product installation and configuration directions and industry best practices to deliver a solution to the customers.
+ Implement a schedule of system backups and archive operations by using best practices for the environment to ensure data/media recoverability.
_Maintain & Troubleshoot_
+ Provide basic server-level administration (hardware and software management, maintenance, upgrades and patches, account maintenance, backups and recoveries, and user assistance) by following documented procedures to ensure a stable environment.
+ Monitor and tune the system by following documentation and procedures to achieve optimum performance levels.
+ Develop basic scripts and solutions by using departmental standards to automate systems management.
+ With guidance and direction, perform basic system software upgrades including planning and scheduling, testing, and coordination by following documentation and departmental standards to provide a stable product for the environment.
+ Audit and maintain user access and authorization by following access and authorization documentation to provide for system security.
+ Generate and maintain periodic and ongoing system-specific reports by using appropriate tools to assess system performance, integrity, and capacity in order to deliver a stable environment to the users.
+ Follow and maintain IT security awareness and best practices by understanding security principles as they pertain to environments supported in order to deliver secure solutions to customers.
+ Utilize system management and monitoring tools and incident tracking systems by following documentation and standards to detect incidents and take corrective actions. Participate in determining the root cause.
+ Monitor changes and resolve routine incidents by responding to problems as they occur, reviewing all processing and output of the newly implemented solution, and proactively verifying that the solution works, in order to satisfy customer requirements and provide a smooth transition to the new solution.
_Project Collaboration & Lifecycle Participation_
+ With guidance and direction, implement changes while adhering to the change management policies and procedures in order to deliver a successful solution to the customer. Communicate to all parties the nature, significance, and risk factors.
+ Participate in the evaluation of vendor proposals by reviewing requirements for the product to select the most appropriate vendor.
+ Participate with vendors, consultants, and internal groups in developing applications by meeting with the team on a regular basis to deliver quality products to customers.
+ Participate in scheduled project team meetings by attending all meetings to provide input to the project team.
+ Create and maintain documentation by writing audience-appropriate materials to serve as technical and/or end user reference.
+ Test all changes by using the appropriate test scenarios to ensure all delivered solutions work as expected and errors are handled in a meaningful way. Contribute to the development of test scenarios.
+ Other duties as assigned.
Minimum Qualifications
+ Bachelor’s Degree.
+ One year of related experience.
+ Additional education may substitute for required experience and additional related experience may substitute for required education beyond a high school diploma/graduation equivalent, to the extent permitted by the JHU equivalency formula.
Preferred Qualifications
+ One to three years of experience administering Linux systems.
+ Basic familiarity with high-performance computing concepts such as cluster scheduling, shared filesystems, and high-speed networking.
+ Understanding of common Linux tools and workflows, including package management, system logs, permissions, and service management.
+ Introductory experience with scripting (Bash or Python) for automation or troubleshooting tasks.
+ Familiarity with configuration management concepts or tools (e.g., Ansible, Puppet).
+ Familiarity with networking fundamentals (TCP/IP, DNS, SSH) and authentication technologies (LDAP, Active Directory).
+ Strong communication skills with the ability to document procedures and work effectively with researchers and technical staff.
+ Demonstrated eagerness to learn new technologies and contribute to a collaborative team environment.
Expected Skills/Proficiency Level
+ Automation - Intermediate
+ Directory Services - Intermediate
+ Operating Software - Intermediate
+ Scripting - Intermediate
+ Software Development Life Cycle - Intermediate
+ Systems Analysis - Intermediate
+ Systems Configuration - Intermediate
+ Systems Development - Intermediate
+ Systems Integration - Intermediate
Classified Title: Systems Administrator
Job Posting Title (Working Title): HPC Systems Administrator (IT@JH Research Computing)
Role/Level/Range: ATP/03/PC
Starting Salary Range: $53,800 - $94,400 Annually (Commensurate w/exp.)
Employee group: Full Time
Schedule: Mon-Fri, 8:30am-5pm
FLSA Status: Exempt
Department name: IT@JH Research Computing
Personnel area: University Administration
Equal Opportunity Employer
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.