Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of the user-level fault handler #376

Open
wants to merge 28 commits into
base: ppos
Choose a base branch
from

Conversation

WenyuanShao
Copy link
Contributor

Summary of this Pull Request (PR)

This PR is the implementation of a user-level fault handler. The system will make a sinv call to the user level fault handler to get the fault handled when a fault or exception happens. This implementation includes the modification below.

  1. Modified the comp_info and set a bit into the captbl of the comp_info on the top of the invstk to indicate whether this is a sinv call due to a fault/exception or not . Add some additional comments to make this design easier to understand.
  2. Created a new type of sinv cap for the fault handlers to get rid of an extra branch in the fast path.
  3. Expanded the thd_introspect to get the register information in user_level and print these register information when fault happens.
  4. Create an interface for fault handlers so different components can implement fault handlers differently.

Intent for your PR

Choose one (Mandatory):

  • This PR is for a code-review and is intended to get feedback, but not to be pulled yet.
  • This PR is mature, and ready to be integrated into the repo.

Reviewers (Mandatory):

(Specify @<github.com username(s)> of the reviewers. Ex: @user1, @user2)
@gparmer , @hungry-foolish , @phanikishoreg

Code Quality

As part of this pull request, I've considered the following:

Style:

  • Comments adhere to the Style Guide (SG)
  • Spacing adhere's to the SG
  • Naming adhere's to the SG
  • All other aspects of the SG are adhered to, or exceptions are justified in this pull request
  • I have run the auto formatter on my code before submitting this PR (see doc/auto_formatter.md for instructions)

Code Craftsmanship:

  • I've made an attempt to remove all redundant code
  • I've considered ways in which my changes might impact existing code, and cleaned it up
  • I've formatted the code in an effort to make it easier to read (proper error handling, function use, etc...)
  • I've commented appropriately where code is tricky
  • I agree that there is no "throw-away" code, and that code in this PR is of high quality

Testing

I've tested the code using the following test programs (provide list here):

  • micro_booter
  • unit_pingpong
  • unit_schedtests
  • llbooter_test


/* Assembly function for sinv from new component */
extern word_t hypercall_entry_rets_inv(spdid_t cur, int op, word_t arg1, word_t arg2, word_t arg3, word_t *ret2, word_t *ret3);

/* Assembly function for sinv for the fault handlers */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider defining unsigned longs as something else in the future, perhaps ptrint_t. This is for the long term.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't they just word_t. That is pretty architecture specific, isn't? You're also using that for registers among others so!

Copy link
Collaborator

@gparmer gparmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very small changes, and one request to remind me about a design decision.

struct cos_aep_info *child_aep = boot_spd_initaep_get(spdid);
struct cos_compinfo *boot_info = boot_spd_compinfo_curr_get();

fault_regs.ax = cos_introspect(boot_info, child_aep->thd, THD_GET_FAULT_REG0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems hard to make portable. I wonder if we should simply make the register an int that goes through a loop with the loop bound defined by an architecture-specific variable. IP/SP/FP are all exceptions, I'm sure.

I don't think this requires changes now, but should be considered in future iterations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are FAULT_REGXXs? Are they different from the regular REGXXs?

return;
}

void
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a memory address fault? I know general protection faults, and page-faults.

return;
}

void
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a future iteration, I think we want two fault handlers for overflow and underflow. I believe the separate components might want to implement each. TODO in the future.

@@ -406,7 +406,8 @@ cap_thd_switch(struct pt_regs *regs, struct thread *curr, struct thread *next, s
struct cos_cpu_local_info *cos_info)
{
struct next_thdinfo *nti = &cos_info->next_ti;
struct comp_info * next_ci = &(next->invstk[next->invstk_top].comp_info);
/* invstk_top never contains in_fault flag in its captbl. So we can use it as comp_info */
struct comp_info * next_ci = (struct comp_info *)&(next->invstk[next->invstk_top].comp_invstk_info);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a function to get this? Shouldn't there be?

* 22-23 = comp0 pgtbl root,
* 24-27 = comp0 component,
* 28~(20+2*NCPU) = per core alpha thd
* 4-7 = memory access fault handler,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, this is a generalization of memory faults. Sorry for the bad memory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:-( Perhaps a comment somewhere on what memory access fault handler is and what it generalizes would be useful. I'm still making my guesses here and that probably is not right!

* the return protocol for each is different.
* Thus, in this structure we use the lowest-order bit in the captbl to store
* a flag which indicates this invocation is due to an exception (or fault) or not.
* We could use the lowest-order bit because captbl is 4096-byte aligned,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could -> can

struct comp_info *comp_info = &comp_invstk_info->comp_info;

comp_info->captbl = (struct captbl *)((unsigned long)comp_info->captbl & (~1u));
return comp_info;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return on separate line

static inline struct comp_info *
thd_invstk_current(struct thread *curr_thd, unsigned long *ip, unsigned long *sp, struct cos_cpu_local_info *cos_info)
thd_invstk_current_fault(struct thread *curr_thd, struct cos_cpu_local_info *cos_info, unsigned long *ip, unsigned long *sp, unsigned long *in_fault)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remind me why we are passing cos_info through all of this, when we have global access to it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like the change of possible substance.

struct cos_cpu_local_info *cos_info)
{
struct invstk_entry *curr_entry;
struct comp_info * curr_comp_info;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"*" in the wrong place. Should be right aligned.

#define THD_REG_READ(op, reg_name) \
case THD_GET_FAULT_##op: \
*retval = t->fault_regs.reg_name; \
break; \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the ; to remove the redundant ;; below.

@@ -1,7 +1,7 @@
COMPONENT=llbooter.o
ASM_OBJS=s_stub.o
COMPONENT=llboot_comp.o
INTERFACES=
INTERFACES=fault_handler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name choices: faults, faultmgr and of course fault_handler here. I actually like having the API name contain interface name as a prefix but again that's a personal preference, just wanted to mention that.

{
struct pt_regs fault_regs;

struct cos_aep_info *child_aep = boot_spd_initaep_get(spdid);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very interesting! It looks like the handler assumes the fault is always in the INITTHD of that component?

void
fault_comp_not_exist(unsigned long sp, unsigned long ip, unsigned long fault_addr, unsigned long fault_type)
{
PRINTLOG(PRINT_ERROR, "in component does not exist fault handler, fault happens in component:%u\n\n", (unsigned int)cos_inv_token());
Copy link
Member

@phanikishoreg phanikishoreg Jun 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies to all PRINT statements, grammar doesn't seem right!
I'd rephrase it as: \"Component Does Not Exist\" fault happened in component [%u] or something like that.

#ifndef FAULT_HANDLER_H
#define FAULT_HANDLER_H

void fault_div_by_zero(unsigned long sp, unsigned long ip, unsigned long fault_addr, unsigned long fault_type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd use word_t where possible.

@@ -1294,7 +1294,7 @@ static int __attribute__((noinline)) composite_syscall_slowpath(struct pt_regs *
vaddr_t entry_addr = __userregs_get3(regs);
invtoken_t token = __userregs_get4(regs);

ret = sinv_activate(ct, cap, capin, dest_comp_cap, entry_addr, token);
ret = sinv_activate(ct, cap, capin, dest_comp_cap, CAP_SINV, entry_addr, token);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sinv_activate with a type? I think you should leave sinv_activate as is. In the FAULTACTIVATE, use faultinv_activate or something and call an internal function that would be sinv_activate_type within inv.h

@@ -286,7 +286,7 @@ arcv_introspect(struct cap_arcv *r, unsigned long op, unsigned long *retval)
*/

static inline void
sinv_call(struct thread *thd, struct cap_sinv *sinvc, struct pt_regs *regs, struct cos_cpu_local_info *cos_info)
sinv_call(struct thread *thd, struct cap_sinv *sinvc, struct pt_regs *regs, struct cos_cpu_local_info *cos_info, unsigned long in_fault)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious why unsigned long in_fault? Wish we could use bool here. :-P

@@ -130,6 +131,7 @@ typedef enum {
typedef enum {
CAP_FREE = 0,
CAP_SINV, /* synchronous communication -- invoke */
CAP_FLT, /* communication of fault handler -- invoke*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

communication of fault handler? Not sure what that means.

/* thread id */
THD_GET_TID,
};
/* get regs */
THD_GET_FAULT_REG0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are fault regs different from regular regs? Even if they represent a "fault" context and not regular context, I think the registers are still same. My point is, FAULT_IP == THD.FAULT.IP in some sense. Perhaps instead of having different set of enums here, have some kind of fault_introspect that passes an in_fault flag and fetches fault_regs?

Copy link
Member

@phanikishoreg phanikishoreg Jun 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don't think we should have REGX naming for IP and SP at the very least?

regs->sp, regs->bp);
PRINTK("Segment registers-> CS: %x, DS: %x, ES: %x, FS: %x, GS: %x, SS: %x, ESI: %x, EDI: %x \n", regs->cs, regs->ds, regs->es,
regs->fs, regs->gs, regs->ss, regs->si, regs->di);
PRINTK("Index registers-> EIP: %x, ESP: %x, EBP: %x\n", regs->ip, regs->sp, regs->bp);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, aren't SI and DI Index registers? Why this change?

thdid_t thdid = curr_thd->tid;
unsigned long ip, sp;
u32_t fault_addr = 0, errcode, eip;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think before you SINV into the fault-handler, we should first see if it is a kernel-level fault or user-level fault. I don't think you'd succeed, perhaps leads to double-fault if you tried anything intelligent if it is a kernel-level fault (obviously it depends on the type of fault.) but I think there is no way a user-level handler would handle kernel-level faults (bugs) obviously. Right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be wrong here about "obviously it depends on the type of fault".
I need to read myself but will the user-level be able to resolve any kind of kernel-level fault, even just a page-fault? I doubt it!
In my opinion, all kernel level faults are our kernel BUGS in someway or the other.

Copy link
Member

@phanikishoreg phanikishoreg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was missing Composite a little (:-P) and found a pending notification in my github. Sorry it took too long.

Good job with the fault-handling stuff, just a few comments from me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants