Job Description
Appealing Points
- Work with Cutting-edge Technology: Opportunity to work with modern infrastructure such as AWS, Kubernetes, and microservices, using advanced monitoring tools like Grafana, Loki, and Sentry in a 24/7 operations environment.
- Wide Scope for Skill Development from L1 to L2: Gain experience across various responsibilities like incident management, problem resolution, B2B support, and knowledge sharing, enabling career growth from entry-level (L1) to advanced (L2) roles.
- Bilingual and Global Work Environment: Utilize both Japanese and English in a professional setting, collaborating with global teams such as Rakuten Mobile and enhancing technical and communication skills in an international environment.
Annual Salary: 7 Million and above
Responsibilities:
- 24/7 Monitoring & Alert Management (L1 Focus):
- Proactively monitor application and service health, infrastructure (AWS, Kubernetes), and network alerts across all platforms, including the eSIM service.
- Utilize monitoring tools such as Grafana, Loki, Sentry, AWS CloudWatch, and automated email reports to identify anomalies and incidents.
- Analyze platform behavior and proactively identify potential issues to prevent service disruptions.
- Incident Management & First Response (L1 Focus):
- Act as the primary point of contact for all incoming incidents, whether from monitoring alerts or reported by users/B2B customers.
- Perform initial logging, triaging, prioritization, tracking, and routing of incidents within ticketing systems (Jira, ServiceNow, Telna Ticketing Platform, Zendesk).
- Adhere strictly to defined Service Level Agreements (SLAs) for first response time.
- Perform initial troubleshooting using predefined runbooks and standard operating procedures (SOPs).
- Record events, problems, and their resolutions accurately in logs and ticketing systems.
- Advanced Incident Resolution (L2 Focus):
- Serve as the primary escalation point for incidents unresolvable by L1, providing advanced troubleshooting and diagnosis.
- Resolve incidents within agreed-upon SLAs and timelines, leveraging runbooks, MOPs, and deep technical knowledge.
- Perform deep-dive troubleshooting for application, data, integration, and underlying infrastructure-related problems.
- Analyze logs (application, system, AWS, Kubernetes, microservices) using tools like Loki, Sentry, and AWS CloudWatch to identify root causes.
- Coordinate with other support or dependency groups (internal or Rakuten Mobile's L3/DevOps) when incidents are linked to their systems or services.
- Communication & Escalation:
- Classify incidents based on severity and impact, escalating critical issues promptly to the L2 team (for L1) or the Incident Manager/Rakuten Mobile's L3/DevOps teams (for L2).
- Direct unresolved issues to the appropriate next level of support personnel.
- Provide timely and professional communication to end-users and stakeholders regarding incident status and resolution.
- Gather and pass on feedback or suggestions from customers to the appropriate internal teams.
Job Qualifications:
- Bachelor of Science (BSc) degree in Computer Science, Information Technology, or a related technical field from a nationally recognized/certified university.
- 3-5+ years of hands-on experience in a technical support, network operations, or IT service desk role, preferably in a 24/7 environment (this range covers both L1 and L2 expectations).
- Proven experience with advanced configuration and troubleshooting of complex IT systems.
- Strong hands-on experience with AWS cloud infrastructure and monitoring (e.g., EC2, VPC, S3, CloudWatch, Lambda).
- Experience with containerization technologies, especially Kubernetes, and microservices architectures.
- Proficiency with monitoring tools such as Grafana, Prometheus, ELK Stack, Loki, and Sentry.
- Solid understanding of networking concepts, Linux/Unix administration, and analyzing system logs.
- Experience with database querying and basic operations.
- Proficiency in scripting (Shell/Bash, Python) for automation and troubleshooting.
- Experience with ticketing systems (e.g., Jira, ServiceNow, Zendesk).
- Excellent analytical, problem-solving, and critical-thinking skills.
- Strong communication and interpersonal skills, with the ability to explain complex technical issues clearly.
- Outstanding customer service skills and a dedication to delivering a positive customer experience.
- Ability to work effectively within a team and independently in a fast-paced, constantly changing environment, including shift work and on-call rotations for 24/7 coverage.
- High level of accountability, excellent work ethic, and a proactive attitude.
- Excellent written and oral communication skills in English and Japanese.
Preferred Qualifications:
- Experience with CI/CD pipelines and understanding of DevOps methodologies.
- ITIL Foundation certification.
- AWS certifications (e.g., Solutions Architect Associate, SysOps Administrator Associate).
- Ability to communicate in Japanese is a significant plus, especially for Rakuten accounts.
- Experience with basic fault finding and fault escalation in a network environment.
- Ability to multi-task efficiently and manage competing priorities.
Languages:
- English: Advanced (Overall 3); Japanese: business level (JLPT N2)
About Company
The company is the largest e-commerce company in Japan and the third-largest e-commerce marketplace company worldwide. It provides a variety of consumer and business-focused services, including e-commerce, e-reading, travel, banking, securities, credit card, e-money, portal and media, online marketing, and professional sports. The company is expanding globally and currently has operations throughout Asia, Western Europe, and the Americas.
[Measures against passive smoking]
Smoking is not allowed indoors
A designated smoking area is provided
Job Requirements: Japanese JLPT N2, Jira, ServiceNow, Zendesk, Python, AWS cloud, IT systems, Linux, Unix, Lambda, DevOps